ABSTRACT 



The National Household Education Survey (NHES) is a data 
collection system of the National Center for Education Statistics (NCES), 
which has as its mission the collection and publication of data on the 
condition of education in the United States by providing information on those 
issues that are best addressed by contacting households rather than 
educational institutions. The NHES is a telephone survey of the 
noninstitutionalized civilian population of the United States. This paper 
addresses issues associated with the bias that might arise in estimates from 
the 1993 NHES because only households with telephones were sampled, and it 
assesses the data quality of recorded interviews from the survey. The focus 
in the study of bias is on the potential for bias in statistics for 3- to 
7-year-olds corresponding to the School Readiness component population of the 
NHES:93. The analysis of undercoverage bias shows that the coverage bias for 
statistics on this age group is not large, even though large differences are 
reported for children living in telephone or nontelephone households. Results 
of the study of recorded interview data coverage indicate that the majority 
of the questions were read as written by the interviewer and respondents 
provided a codable response. Specific items that resulted in a greater than 
average number of problems are identified. Four appendixes contain the 
interview coding forms and frequency of rating information for the School 
Readiness Questionnaire, the School Safety and Discipline Parent 
Questionnaire, and the School Safety and Discipline Youth Questionnaire. 
(Contains 1 exhibit, 2 figures, and 17 tables.) (SLD) 



Working Paper Series 




Telephone Coverage Bias 
and Recorded Interviews in the 
1993 National Household Education Survey (NHES:93) 






Working Paper No. 97-02 



February 1997 

















Contact: Kathryn Chandler 

Surveys and Cooperative Systems Group 
(202) 219-1767 
e-mail: nhes@ed.gov 



U.S. Department of Education 

Richard W. Riley 

Secretary 

Office of Educational Research and Improvement 
Sharon P. Robinson 
Assistant Secretary 

National Center for Education Statistics 
Pascal D. Forgione, Jr. 

Commissioner 

Surveys and Cooperative Systems Group 
Paul D. Planchon 
Associate Commissioner 



The National Center for Education Statistics (NCES) is the primary federal entity for collecting, analyzing, 
and reporting data related to education in the United States and other nations. It fulfills a congressional 
mandate to collect, collate, analyze, and report full and complete statistics on the condition of education 
in the United States; conduct and publish reports and specialized analyses of the meaning and significance 
of such statistics; assist state and local education agencies in improving their statistical systems; and review 
and report on education activities in foreign countries. 

NCES activities are designed to address high priority education data needs; provide consistent, reliable, 
complete, and accurate indicators of education status and trends; and report timely, useful, and high quality 
data to the U.S. Department of Education, the Congress, the states, other education policymakers, 
practitioners, data users, and the general public. 

We strive to make our products available in a variety of formats and in language that is appropriate to a 
variety of audiences. You, as our customer, are the best judge of our success in communicating 
information effectively. If you have any comments or suggestions about this or any other NCES product 
or report, we would like to hear from you. Please direct your comments to: 

National Center for Education Statistics 
Office of Educational Research and Improvement 
U.S. Department of Education 
555 New Jersey Avenue, NW 
Washington, DC 20208 



Suggested Citation 

U.S. Department of Education. National Center for Education Statistics. Telephone Coverage Bias and Recorded Interviews in 
the 1993 National Household Education Survey (NHES:93), Working Paper No. 97-02, by J. Michael Brick, Ellen Tubbs, Mary 
A. Collins, Mary Jo Nolin, David Cantor, Kerry Levin, and Yuki Carnes. Project Officer, Kathryn Chandler. Washington, DC: 
1997. 






Foreword 



Each year a large number of written documents are generated by NCES staff and 
individuals commissioned by NCES which provide preliminary analyses of survey results and 
address technical, methodological, and evaluation issues. Even though they are not formally 
published, these documents reflect a tremendous amount of unique expertise, knowledge, and 
experience. 

The Working Paper Series was created in order to preserve the information contained 
in these documents and to promote the sharing of valuable work experience and knowledge. 
However, these documents were prepared under different formats and did not undergo vigorous 
NCES publication review and editing prior to their inclusion in the series. Consequently, we 
encourage users of the series to consult the individual authors for citations. 

To receive information about submitting manuscripts or obtaining copies of the series, 
please contact Ruth R. Harris at (202) 219-1831 or U.S. Department of Education, Office of 
Educational Research and Improvement, National Center for Education Statistics, 555 New 
Jersey Ave., N.W., Room 400, Washington, D.C. 20208-5654. 



Susan Ahmed 

Chief Mathematical Statistician 
Statistical Standards and 
Services Group 



Samuel S. Peng 
Director 

Methodology, Training, and Customer 
Service Program 



Telephone Coverage Bias and Recorded Interviews in the 
1993 National Household Education Survey (NHES:93) 



Prepared by: 

J. Michael Brick 
Ellen Tubbs 
Mary A. Collins 
Mary Jo Nolin 
David Cantor 
Kerry Levin 
Yuki Carnes 

Westat, Inc. 



Prepared for: 

U.S. Department of Education 
Office of Educational Research and Improvement 
National Center for Education Statistics 



February 1997 






Table of Contents 






Section 

Foreword 

1. Overview of the National Household Education Survey 
     NHES:93 Design 

2. Telephone Coverage Bias in the NHES:93 
     Telephone Coverage and Bias 
     Estimated Differences Between Telephone and Nontelephone Households 
     Statistical Adjustments of the Estimates 
     Estimates of Coverage Bias 
     Conclusions 
     References 

3. An Assessment of Data Quality from Recorded Interviews 
     Overview 
     Background 
     Method 
     Findings 
     Implications 
     References 



Appendices 

Appendix A: Recorded Interview Coding Forms 
Appendix B: Frequency of Rating for School Readiness Questionnaire (SR) 
Appendix C: Frequency of Rating for School Safety and Discipline Parent Questionnaire (SS&D-P) 
Appendix D: Frequency of Rating for School Safety and Discipline Youth Questionnaire (SS&D-Y) 


List of Tables 



2-1. Estimated percentage of 3- to 7-year-olds in telephone and 
nontelephone households who have specific characteristics 

2-2. Relative raking adjustment factors for NHES:93 School Readiness 
component, by race/ethnicity and family income 









2-3. Estimated percentage of 3- to 7-year-olds in all households who 
have specific characteristics, adjusted estimates based on raking 
only children in telephone households, and the bias of the estimates 
before and after adjustment 

2-4. Estimated percentage of 3- to 7-year-olds in all households who 
have specific characteristics, adjusted estimates based on raking 
only children in telephone households, and the bias of the adjusted 
estimates, by race/ethnicity 

2-5. Estimated percentage of 3- to 5-year-olds in telephone and 
nontelephone households who engaged in specific activities with 
family members 

2-6. Relative raking adjustment factors for NHES:91 Early Childhood 
Education component, by race/ethnicity and family income 

2-7. Estimated percentage of 3- to 5-year-olds in all households who 
engaged in specific activities with family members and adjusted 
estimate based on raking only 3- to 5-year-olds in telephone households 

2-8. Estimated percentage of 3- to 5-year-olds in all households who engaged in 
specific activities with family members and adjusted estimate based on 
raking only 3- to 5-year-olds in telephone households, by race/ethnicity 

3-1. Overall level of agreement (interviewer and respondent) of ratings 

3-2. Level of agreement of ratings for interviewer behavior 

3-3. Level of agreement of ratings for respondent behavior 

3-4. Number of exact/minor codes by rater and form 

3-5. Overall level of agreement of ratings after collapsing "minor" and 
"exact" codes 

3-6. Level of agreement of ratings for interviewer behavior after collapsing 
"minor" and "exact" codes 

3-7. Total number of ratings per rating category 

3-8. Total number of codes given by form 

3-9. Frequency of rating on introductions 






List of Figures 



2-1. Telephone coverage of adults from 1988 to 1992 

2-2. Telephone coverage of adults in 1992 by race/ethnicity 



List of Exhibits 

3-1. Behavior coding indicator definitions 









1. Overview of the National Household Education Survey 



The National Household Education Survey (NHES) is a data collection system of the National 
Center for Education Statistics (NCES), which has as its legislative mission the collection and 
publication of data on the condition of education in the Nation. The NHES is specifically designed to 
support this mission by providing information on those educational issues that are best addressed by 
contacting households rather than schools or other educational institutions. The NHES provides 
descriptive data on the educational activities of the U.S. population and offers policymakers, 
researchers, and educators a variety of statistics on the condition of education in the United States. 

The NHES is a telephone survey of the noninstitutionalized civilian population of the U.S. 
Households are selected for the survey using random digit dialing (RDD) methods, and data are 
collected using computer-assisted telephone interviewing (CATI) procedures. About 45,000 to 64,000 
households are screened for each administration, and individuals within households who meet 
predetermined criteria are sampled for more detailed or extended interviews. The data are weighted to 
permit estimates of the entire population. The NHES survey for a given year typically consists of a 
Screener, which collects household composition and demographic data, and extended interviews on two 
substantive components addressing education-related topics. In order to assess data item reliability and 
inform future NHES surveys, each administration also includes a subsample of respondents for a 
reinterview. 

The primary purpose of the NHES is to conduct repeated measurements of the same 
phenomena at different points in time. Throughout its history, the NHES has collected data in ways 
that permit estimates to be tracked across time. This includes repeating topical components on a 
rotating basis in order to provide comparative data across survey years. In addition, each 
administration of the NHES has benefited from experiences with previous cycles, resulting in 
enhancements to the survey procedures and content. Thus, while the survey affords the opportunity for 
tracking phenomena across time, it is also dynamic in addressing new issues and including conceptual 
and methodological refinements. 

A new design feature of the NHES program implemented in the NHES:96 is the 
collection of demographic and educational information on members of all screened households, rather 
than just those households potentially eligible for a topical component. In addition, this expanded 
screening feature includes a brief set of questions on an issue of interest to education program 
administrators or policymakers. The total Screener sample size was sufficient to produce state 
estimates of household characteristics for the NHES:96. 

The NHES has been conducted in 1991, 1993, 1995, and 1996. Topics addressed by the 
NHES:91 were early childhood education and adult education. The NHES:93 collected information 
about school readiness and school safety and discipline. The 1991 components were repeated for the 
NHES:95, addressing early childhood program participation and adult education. Both components 
underwent substantial redesign to incorporate new issues and develop new measurement approaches. 
In the NHES:96, the topical components were parent/family involvement in education and civic 
involvement. The NHES:96 expanded screening feature included a set of questions on public library 
use. 

In addition to its topical components, the NHES system has also included a number of 
methodological investigations. These have resulted in technical reports and working papers covering 
diverse topics such as telephone undercoverage bias, proxy reporting, and sampling methods. This 
series of technical reports and working papers provides valuable information on ways of improving the 
NHES and other surveys. 

This working paper addresses selected data quality activities implemented in the NHES:93. 
Readers interested in other aspects of the NHES:93 may wish to review the user's manuals noted 
above, as well as other working papers. The NHES:93 working papers include Design, Data 
Collection, Monitoring, Interview Administration Time, and Data Editing in the 1993 National 
Household Education Survey (Brick et al. forthcoming), Unit and Item Response, Weighting, and 
Imputation Procedures in the 1993 National Household Education Survey (Brick et al. forthcoming), 
and Comparison of Estimates from the 1993 National Household Education Survey (Collins et al. 
forthcoming). In addition, a forthcoming technical report, Reinterviews in the 1993 National 
Household Education Survey (Brick et al. forthcoming), presents results of a reinterview test conducted 
with NHES:93 respondents. 

NHES:93 Design 

The 1993 National Household Education Survey (NHES:93) addressed readiness for school and 
safety and discipline in school. These topics are related to Goal 1 and Goal 6, two of the National 
Education Goals. Specifically, Goal 1 states that "By the year 2000, all children in America will start 
school ready to learn." Goal 6 states that "By the year 2000, every school in America will be free of 
drugs and violence and will offer a safe, disciplined environment conducive to learning." 

The School Readiness (SR) component covered experience in early childhood programs; the 
child's accomplishments and difficulties in several developmental domains; school adjustment and 
related problems; delayed kindergarten entry; early primary school experiences, including 
repeating grades; the child's general health and nutritional status; home activities; and family 
characteristics such as stability and economic risk factors. Altogether, 10,888 children aged 3 through 
7 or in 2nd grade or below were sampled. Interviews were conducted with 4,423 parents of preschool 
children, 2,126 parents of kindergartners, 4,277 parents of primary school children, and 62 parents of 
home school children. For further information on the content of the SR component, see the School 
Readiness Data File User's Manual (Brick et al. 1994). 

The School Safety and Discipline component (SS&D) focused on four areas: school 
environment, school safety, school discipline policy, and alcohol/other drug use and education. The 
SS&D interview gathered general perceptions of the school learning environment from both parents and 
students. Parents of 12,680 children in 3rd through 12th grades were interviewed, as were 6,504 
students in 6th through 12th grades. For further information on the content of the SS&D component, 
see the School Safety and Discipline Data File User's Manual (Brick et al. 1994). 

The NHES:93 was developed to provide reliable estimates for each of the two different 
components described above. The inclusion of two survey components made the overall survey more 
cost effective, thus allowing for larger sample sizes and more precise estimates. This strategy was key 
to the NHES design. By including more than one topic within the framework of a single survey, the 
cost of screening households to find those eligible for the study could be partitioned over the component 
surveys. 



It was possible that the same household member could be selected to respond to more than one 
interview and/or that more than one household member could be sampled. For the SR interview, if 
there were one or two eligible children in the household, interviews were conducted for those children. 
If the household included more than two eligible children, two children were randomly sampled from 
that household. For the SS&D interview, if a household had one eligible youth, that youth was selected 
with a probability that depended on his/her grade (students in grades 3 through 5 were selected with a 
lower probability than those in grades 6 through 12). If a household had two or more eligible youths, 
the sampling depended upon the number of youths in the household in each of the two grade categories. 
A maximum of two youths was selected from any household for the SS&D component, one from the 
lower grades and one from the upper grades. 
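
The SR within-household rule described above (interview every eligible child when there are one or two, otherwise draw two at random) can be sketched in a few lines of Python. This is only an illustration: the function name, the argument names, and the use of a seeded generator are assumptions made here, not details of the CATI system. The SS&D rule is not sketched because the text does not give its grade-dependent selection probabilities.

```python
import random


def sample_sr_children(eligible_children, rng=None, max_per_household=2):
    """Return the children selected for the SR interview in one household.

    If the household has max_per_household or fewer eligible children,
    all of them are selected; otherwise max_per_household children are
    drawn at random without replacement.
    """
    rng = rng or random.Random()
    children = list(eligible_children)
    if len(children) <= max_per_household:
        return children
    return rng.sample(children, max_per_household)


if __name__ == "__main__":
    rng = random.Random(1993)  # fixed seed so the illustration is repeatable
    print(sample_sr_children(["child_a"], rng))
    print(sample_sr_children(["child_a", "child_b", "child_c", "child_d"], rng))
```

A child in a household with more than two eligible children is selected with probability 2 divided by the number of eligible children, which is the kind of unequal selection probability that survey weights must reflect.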

Even though sampling methods reduced the number of interviews per household, the length of 
the interview was considered to be a critical factor in obtaining high response rates and reliable 
estimates. Therefore, the number of items included in the NHES:93 was limited in order to help 
improve response rates and reduce the demands made on survey respondents. 

Because of the above requirements, complex sampling techniques, and the need for quick and 
accurate administration, the NHES:93 was conducted using computer-assisted telephone interviewing 
(CATI) technology. Some of the advantages of CATI for the NHES:93 included improved project 
administration, online sampling and eligibility checks, scheduling of interviews according to a priority 
scheme to improve response rates, managing data quality by controlling skip patterns and checking 
responses online for range and consistency, and an online "help" function to answer interviewers’ 
questions. 

Three different interview instruments were used in the NHES:93. These instruments were the 
Screener, the SR interview, and the SS&D interview. Items within each of the three instruments were 
programmed so that the appropriate items appeared on the interviewer's computer screen 
corresponding to the respondent's answer to previous queries. These instruments are discussed in 
detail in the School Readiness Data File User's Manual and the School Safety and Discipline Data File 
User's Manual. 



2. Telephone Coverage Bias in the NHES:93 



This section addresses issues associated with the bias that might arise in estimates from the 1993 
National Household Education Survey (NHES:93) because only households with telephones were 
sampled. Data from the 1992 October supplement to the Current Population Survey (CPS) are used to 
evaluate the size of the bias. The focus of this section is on the potential for bias in statistics for 3- to 
7-year-olds corresponding to the School Readiness (SR) component population in the NHES:93. Estimates 
of coverage bias for the School Safety & Discipline (SS&D) component population are not presented. 
Because students had to be enrolled in order to be eligible for the survey, comparisons of enrollment 
status for this population are not useful. In addition, other measures of interest, such as victimization at 
school, were not available from sources that had telephone coverage information available. 

This analysis continues research on telephone coverage bias in estimates from the NHES that 
began with the 1989 Field Test (Brick, Burke, and West 1992). Other research was conducted using the 
data from the NHES:91 (Brick 1992). The procedures used in this analysis are consistent with the 
methods used to estimate the coverage bias in estimates of characteristics of those studies. Tables from 
the NHES:91 that were based on data from the 1990 October supplement to the CPS are provided at the 
end of Section 2. 



Telephone Coverage and Bias 

The NHES:93 was a random-digit-dial telephone survey and thus included only persons who lived 
in households with telephones. Approximately 6 percent of all persons live in households without 
telephones according to data from the March 1992 CPS. The Bureau of the Census used data from the 
CPS to estimate the trend in telephone coverage of adults (persons 16 years and older). Figure 2-1 shows 
there has been a slight increase in the percentage of adults in telephone households from 1988 to 1992. 

The percentage of adults in households with telephones varies somewhat by the characteristics of 
the populations being considered. Figure 2-2 shows telephone coverage by race/ethnicity. White adults 
have a coverage rate of approximately 96 percent, which is slightly above the 95 percent for all adults. 
Black and Hispanic adults have lower coverage rates. 

The inference population for the NHES includes persons living in both telephone and 
nontelephone households. Since the survey only interviews persons in telephone households and yet 
makes inference to persons living in both telephone and nontelephone households, the question of bias in 
the estimates naturally arises. 

Bias has a specific technical definition in this context. Bias refers to the expected difference 
between the estimates from the survey and the actual population value. For example, if all telephone 
households were included in the survey and responded to the required interviews, the difference between 
the estimate from the survey and the actual population value (which includes the responses of persons 
living in nontelephone households) is the bias due to incomplete coverage. Since the NHES is based on 
a sample, the bias is defined as the expected or average value of this difference over all possible samples. 



Figure 2-1. Telephone coverage of adults from 1988 to 1992 




SOURCE: Special tabulations prepared by Bureau of Census from the 1988 through 1992 Current 
Population Surveys. Average coverage based on March, July, and November. Includes adults 16 
years of age and older. 



Figure 2-2. Telephone coverage of adults in 1992 by race/ethnicity 

[Figure not reproduced. Coverage is shown by race/ethnicity for four categories: Total, White, Black, and Hispanic.] 



SOURCE: Special tabulations prepared by Bureau of Census from the 1992 Current Population 
Surveys. Average coverage based on March, July, and November. Includes adults 16 years of age 
and older. 



Bias due to coverage problems can be substantial when two conditions are satisfied. First, the 
differences between the characteristics in the covered population and the uncovered population must be 
relatively large. For example, consider estimating the percentage of persons enrolled in a program. If 
the percentage enrolled is nearly identical in both the covered and uncovered population, then the bias 
for this estimate will be negligible. Second, the proportion of the population that is not covered by the 
survey must be large compared to the size of the estimates. If only 2 percent of the population is not 
covered, estimates of totals that comprise 20 or 30 percent of the population will not be greatly affected, 
even if the differences in the characteristics between the covered and uncovered populations are 
relatively large. 

It is important to realize that the second condition requires that the proportion uncovered be large 
relative to the size of the estimates. If the estimate under consideration is for a domain or subgroup that 
is small, then even a small coverage problem can result in important biases in the estimates of the 
domain. For example, previous research in NHES showed that although only a small percentage of all 
14- to 18-year-olds are school dropouts, there is considerable concentration of dropouts in nontelephone 
households. Consequently, there are substantial biases in estimates of dropouts although the biases are 
generally quite small for other statistics on 14- to 18-year-olds. 
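
The point about small domains can be illustrated with a short Python sketch. All of the rates below are hypothetical values chosen for illustration, not estimates from the NHES or the CPS, and the function names are introduced here. The sketch treats the full-population rate as the household-share-weighted mix of the telephone and nontelephone rates and reports the bias of the telephone-only rate relative to that full-population rate.

```python
def overall_rate(rate_tel, rate_nontel, p_nontel):
    """Full-population rate as a mix of telephone and nontelephone rates."""
    return (1.0 - p_nontel) * rate_tel + p_nontel * rate_nontel


def relative_coverage_bias(rate_tel, rate_nontel, p_nontel):
    """Bias of the telephone-only rate, as a fraction of the full rate."""
    truth = overall_rate(rate_tel, rate_nontel, p_nontel)
    return (rate_tel - truth) / truth


if __name__ == "__main__":
    p_nontel = 0.06  # hypothetical share of the population without telephones

    # Hypothetical common characteristic: 30 percent vs. 40 percent.
    common = relative_coverage_bias(0.30, 0.40, p_nontel)

    # Hypothetical rare characteristic concentrated in nontelephone
    # households: 4 percent vs. 20 percent.
    rare = relative_coverage_bias(0.04, 0.20, p_nontel)

    print(f"relative bias, common characteristic: {common:.1%}")
    print(f"relative bias, rare characteristic:   {rare:.1%}")
```

With these made-up inputs the absolute biases are of similar size (under one percentage point in both cases), but the relative bias for the rare characteristic is roughly ten times larger, which is the pattern the dropout example describes.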

Mathematically, the bias can be written as 

$$\mathrm{Bias}(\hat{y}_t) = P_n\,E(\hat{y}_t - \hat{y}_n), \qquad (1)$$

where $\hat{y}_t$ is the estimated characteristic based on the telephone households only, $P_n$ is the proportion of 
nontelephone households, $\hat{y}_n$ is the estimated characteristic based on the nontelephone households, and 
$E$ is the expectation operator for averaging over all possible samples. 
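
One way to see where equation (1) comes from is the following short derivation. It assumes that the full-population estimate, written here as $\hat{y}$ (a symbol introduced only for this illustration), is the mix of the telephone and nontelephone estimates weighted by their population shares, and that $P_n$ is treated as a fixed quantity.

```latex
\begin{align*}
\hat{y} &= (1 - P_n)\,\hat{y}_t + P_n\,\hat{y}_n \\
\hat{y}_t - \hat{y} &= \hat{y}_t - (1 - P_n)\,\hat{y}_t - P_n\,\hat{y}_n
                     = P_n\,(\hat{y}_t - \hat{y}_n) \\
\mathrm{Bias}(\hat{y}_t) &= E(\hat{y}_t - \hat{y}) = P_n\,E(\hat{y}_t - \hat{y}_n)
\end{align*}
```

The bias is therefore the product of two factors, the nontelephone share and the expected telephone/nontelephone difference, and it vanishes when either factor is zero.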

This expression shows that the bias in the estimates increases as the proportion of households 
without telephones increases. Thus, the percentage of households without telephones, $P_n$, is an 
important component in assessing the size of the bias. The population of interest in the School Readiness 
(SR) component was 3- to 7-year-olds.¹ The percentage of this population living in nontelephone 
households is estimated at about 9.5 percent, based on the October 1992 CPS. This figure is higher than 
the 6 percent of all persons who live in nontelephone households, suggesting that bias could be a more 
significant problem for this domain than for estimates relating to the total population. 

Estimated Differences Between Telephone and Nontelephone Households 

The other component in the bias formula is the difference in estimates of telephone and 
nontelephone households. For many statistics there are major differences between telephone and 
nontelephone households. For example, having a telephone is strongly related to income, 
socioeconomic status, and lifestyle. Thornberry and Massey (1988) assessed 
noncoverage bias of estimates of health characteristics and found many health and health-related 
characteristics of persons in nontelephone households were different from those of persons in telephone 
households. Brick, Burke, and West (1992) studied estimates for education statistics. They found 
smaller differences between telephone and nontelephone households for enrollment statistics than for 
other characteristics. 



¹ This group is defined as children 3 to 7 years old regardless of their grade and children 3 years and older who are not yet in 
the 3rd grade. 



To examine the extent of the differences in the characteristics of persons in telephone and 
nontelephone households, the CPS, which is a household survey done both door to door and by 
telephone, was used as a data source. The October 1992 CPS contained two sets of items relevant to the 
NHES:93 SR component. One set of questions asked about the child having disabling conditions; the 
other asked about enrollment in school. The NHES:91 Early Childhood Education component on 3- to 
5-year-old children, which was mentioned earlier, used the October 1990 CPS as the data source (Brick 
1992). Results from the NHES:91 analysis appear in tables 2-5 through 2-8. The questions in that survey asked 
about the frequency of certain activities that a family member might have done with the child in the past 
week, month, or year. 

Tabulations were made of the percentage distributions for the October 1992 CPS items for the 
population of children aged 3 to 7 years old (table 2-1). The percentage distributions for telephone and 
nontelephone households² are shown separately in the table. 

The percentage distributions reveal some differences between the two estimates. Many of the 
differences are small. For example, the disability estimates are very similar for telephone households 
and nontelephone households. This may be because disabling conditions are not correlated with 
socioeconomic status. In contrast, enrollment in public versus private school and repeating a grade are 
more likely to be associated with socioeconomic status (McLaughlin et al. 1995; Collins and Brick 
1993). The differences between telephone and nontelephone household estimates are greater for these 
items and the bias is therefore likely to be larger for these characteristics. 



Statistical Adjustments of the Estimates 

In the NHES, the standard practice is to make statistical adjustments of survey estimates that 
compensate, to the extent possible, for design problems. This practice is especially important for surveys 
in which there is the potential for bias from undercoverage. The adjustments include ordinary 
nonresponse adjustments and the adjustments to known control totals. Adjustments to control totals are 
typically performed using poststratification or raking. 

One of the goals of adjusting to control totals is to make the estimates consistent with known 
totals, but often a more important goal is to reduce the impact of imperfections in the design and conduct 
of the study on the estimates. In telephone surveys, these adjustments are designed to partially correct 
for undercoverage bias. 

For the NHES:93, three dimensions of raking were used. The first dimension employed variables 
that indicated the Census region in which the person lived and whether or not the home was owned/other 
or rented. The second dimension was a combination of race/ethnicity and family income. The third 
dimension was age and grade. Based on these dimensions, the sample weights were raked to be 
consistent with the marginal control totals from the October 1992 CPS. 

When sample weights are poststratified, a poststratified adjustment factor can be defined as the 
ratio of the poststratum control count for a cell and the sum of the weights of all the cases in that cell. 
The final, poststratified weight is the sampling weight multiplied by the adjustment factor for a cell. 



² The classification of a household by telephone status was based on the response to the item, "Is there a telephone in this 
house/apartment?" This question was asked in the July and November CPS and Census Bureau staff inserted the reported 
value on the October file for this analysis. 
Raking adjustments can be defined in a similar way by thinking of raking as a multidimensional form of 
poststratification. Because more than one dimension is involved in raking, the adjustment of the weights 
is iterated over each of the dimensions until the sum of the adjusted weights equals the marginal control 
counts for all the raking dimensions, within specified tolerances. The raking adjustment factor can be 
defined as the ratio of the sum of the adjusted weights in a cell divided by the sum of the weights in that 
cell prior to adjustment. 
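
The iteration described above can be sketched as a small Python function. This is a minimal illustration of raking (iterative proportional fitting) under stated assumptions, not the production weighting procedure used for the NHES:93. The record layout, the dimension names, the tolerance, and the control totals in the example are all hypothetical, and the sketch assumes that every control category contains at least one record and that the control totals for each dimension sum to the same grand total.

```python
def rake(records, margins, tol=1e-8, max_iter=100):
    """Rake weights so weighted category sums match the control totals.

    records: list of dicts, each with a 'weight' key and one key per
             raking dimension giving that record's category.
    margins: {dimension: {category: control_total}}.
    Returns the list of raked weights, in the same order as records.
    """
    weights = [float(r["weight"]) for r in records]
    for _ in range(max_iter):
        max_gap = 0.0
        for dim, targets in margins.items():
            # Current weighted sum in each category of this dimension.
            sums = {cat: 0.0 for cat in targets}
            for rec, w in zip(records, weights):
                sums[rec[dim]] += w
            # Track the largest relative gap seen before adjusting.
            for cat, target in targets.items():
                max_gap = max(max_gap, abs(sums[cat] - target) / target)
            # Poststratify on this one dimension.
            weights = [
                w * targets[rec[dim]] / sums[rec[dim]]
                for rec, w in zip(records, weights)
            ]
        if max_gap < tol:
            break
    return weights


if __name__ == "__main__":
    # Hypothetical records and control totals, for illustration only.
    records = [
        {"weight": 10, "region": "NE", "tenure": "own"},
        {"weight": 10, "region": "NE", "tenure": "rent"},
        {"weight": 10, "region": "S", "tenure": "own"},
        {"weight": 10, "region": "S", "tenure": "rent"},
    ]
    margins = {
        "region": {"NE": 50.0, "S": 70.0},
        "tenure": {"own": 80.0, "rent": 40.0},
    }
    raked = rake(records, margins)
    factors = [w / r["weight"] for w, r in zip(raked, records)]
    print(raked)
    print(factors)
```

Each pass over a single dimension is an ordinary poststratification step, and the raking factor for a case is the ratio of its raked weight to its weight before raking, matching the definition given in the text.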

The coverage bias in the estimates is the residual bias that is present after the weight adjustments 
have been made. To evaluate the effectiveness of these adjustments for reducing the bias from coverage, 
relative raking adjustment factors from the NHES:93 SR component were computed. Relative 
adjustment factors were created by dividing the average raking factor for a specific cell in a raking 
dimension by the average factor across all cases. 

The formal definition of the relative raking adjustment factor requires introducing some 
terminology. The raking factor for person $i$ in cell $c$ is denoted as $A_{i,c}$, where $c = (j \otimes k \otimes l)$ and $j$, $k$, 
and $l$ refer to the three raking dimensions used in the NHES:93. Further, the set $C_{j'} = \{c : j = j'\}$ is 
defined as the cells where the first dimension of the raking variables is always $j'$. The sets for the other 
dimensions are $C_{k'}$ and $C_{l'}$. 

Now, the average raking factor across all cases can be written as: 

$$\bar{A} = \frac{\sum_{c} \sum_{i \in c} w_i A_{i,c}}{\sum_{c} \sum_{i \in c} w_i}, \qquad (2)$$

where $A_{i,c}$ is the raking factor for person $i$ in cell $c$ and $w_i$ is the weight for person $i$ prior to raking. The 
average raking factor for a specific value (say $j'$) on dimension $j$ is given by: 

$$\bar{A}_{j'} = \frac{\sum_{c \in C_{j'}} \sum_{i \in c} w_i A_{i,c}}{\sum_{c \in C_{j'}} \sum_{i \in c} w_i}. \qquad (3)$$

Finally, the relative raking adjustment factor is the ratio of these two quantities: 

$$RA_{j'} = \frac{\bar{A}_{j'}}{\bar{A}}. \qquad (4)$$
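
As a rough sketch of how the relative raking adjustment factor might be computed, the Python below assumes that the averages are weighted by the pre-raking weights, so that each average equals the sum of the raked weights divided by the sum of the prior weights. The function name `relative_factors` and the example data are hypothetical.

```python
def relative_factors(prior_weights, raked_weights, labels):
    """Relative raking adjustment factor for each category in `labels`.

    prior_weights -- weight for each case before raking
    raked_weights -- weight for each case after raking
    labels        -- category of each case on the dimension of interest
    """
    # Average factor across all cases (weighted by the prior weights).
    overall = sum(raked_weights) / sum(prior_weights)

    prior_by, raked_by = {}, {}
    for w0, w1, lab in zip(prior_weights, raked_weights, labels):
        prior_by[lab] = prior_by.get(lab, 0.0) + w0
        raked_by[lab] = raked_by.get(lab, 0.0) + w1

    # Category average divided by the overall average.
    return {lab: (raked_by[lab] / prior_by[lab]) / overall for lab in prior_by}


# Hypothetical data: two income categories.
prior = [100.0, 100.0, 200.0, 200.0]
raked = [180.0, 170.0, 180.0, 170.0]
income = ["low", "low", "high", "high"]
rel = relative_factors(prior, raked, income)
```

In this made-up example the low-income category is adjusted upward relative to the average (factor above one) and the high-income category downward (factor below one).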



For example, the average raking adjustment factor for all 3- to 7-year-olds who were Hispanic and 
lived in a household with a family income of less than $10,000 per year was computed. This average 
adjustment factor was divided by the average adjustment factor for all 3- to 7-year-olds to create the 
relative adjustment factor for this subgroup. The relative factors for nine categories of race/ethnicity and 
family income are shown in table 2-2. 

The relative raking adjustment factors are greater than unity (indicating that 3- to 7-year-old 
children in this group are adjusted upward relative to the average across all groups) for all the lowest 
income groups, regardless of race/ethnicity. This, of course, adjusts for the lower telephone penetration 
in the low-income group. The relative adjustment for blacks in households with incomes of less than 
$10,000 is larger than for any other group, which is consistent with the low telephone coverage for 
blacks as shown in figure 2-2. The adjustments for families with incomes of $25,000 or more are the 
smallest of the groups summarized, reflecting their relatively high telephone coverage. The relative 
adjustment factors are generally lower than the comparable factors from the NHES:91 study shown in 
table 2-6. The factors for Hispanics, in particular, are lower for this survey than for the NHES:91. No 
specific explanation is available for this result. However, some of the differences may arise because 
the 1992 CPS estimates used to develop the raking factors for NHES:93 were based on 1990 
Census data, whereas the 1990 CPS estimates used to develop the raking factors for NHES:91 had not 
yet been adjusted to the 1990 Census and were still based on 1980 Census data. 

The nine categories are used to illustrate the impact of the adjustments on the estimates. These 
factors do not include all the variability in the adjustments used in weighting the data for NHES:93. For 
example, the NHES:93 adjustments differed by age, but the factors in the table are collapsed across all 
ages. Factors across other categories could also have been selected. The income variable was 
considered important due to the high correlation between family income and telephone status. In the 
actual NHES:93 estimation process, the full set of adjustments was used rather than the adjustments 
shown in table 2-2. 



Estimates of Coverage Bias 

The relative adjustment factors presented above were used to simulate the impact of the raking 
adjustment on the estimated percentage distributions in the NHES:93 SR component. The factors were 
applied to October 1992 CPS estimates of characteristics of 3- to 7-year-olds living in telephone 
households to produce estimated percentage distributions for all 3- to 7-year-olds. In this way, the 
telephone households from the October 1992 CPS are used to simulate the impact of the adjustments on 
the estimates in a telephone survey (table 2-3). For comparison purposes, the estimates based on all CPS 
households and the biases associated with the estimates before and after adjustment are also shown in 
this table. A negative bias indicates that the sample estimate is smaller than the estimate based on all 
households. 
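
One way to picture the simulation is the sketch below, which reweights cell-level estimates from telephone households by a relative adjustment factor and compares the result with the unadjusted figure. The cell structure, the function name `adjusted_percentage`, and all numbers are hypothetical; the actual tabulations used the CPS microdata.

```python
def adjusted_percentage(cells):
    """Weighted percentage after applying a relative factor to each cell.

    cells -- list of (weighted_count, relative_factor, percent) tuples, one
             per adjustment cell, all taken from telephone households only
    """
    numerator = sum(n * f * p for n, f, p in cells)
    denominator = sum(n * f for n, f, _ in cells)
    return numerator / denominator


# Hypothetical cells: (weighted count, relative factor, percent enrolled).
cells = [
    (1000.0, 1.5, 60.0),   # low-income cell, adjusted upward
    (3000.0, 0.75, 80.0),  # higher-income cell, adjusted downward
]

# Setting every factor to 1 gives the unadjusted telephone estimate.
unadjusted = adjusted_percentage([(n, 1.0, p) for n, _, p in cells])
adjusted = adjusted_percentage(cells)

# A bias is then the telephone-based estimate minus the all-household estimate.
all_households = 73.0  # hypothetical benchmark
bias_before = unadjusted - all_households
bias_after = adjusted - all_households
```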

The comparison of the estimates from persons living in all households to the adjusted estimates 
based on those only in telephone households shows that the adjustments decreased the bias in some 
cases, slightly increased it in others, and left it unchanged in the rest. In almost all 
circumstances, the estimated biases are not significantly different from zero. 

Even if the adjustments did not correct for the differential undercoverage bias, the estimates based 
only on respondents in telephone households might not be as misleading as the data in table 2-1 indicate. 
When the differences between estimates from telephone and nontelephone households are not 
very large and the proportion of nontelephone households (P_n) is small, the biases are not large. The 
unadjusted estimates from telephone households are slightly more biased than those based on the raking 
adjustment, but they are not wildly different from the actual estimates as shown in table 2-3. The reason 
is simple: Less than 10 percent of 3- to 7-year-olds live in nontelephone households, and this limits the 
bias that can be incurred from this source. 
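
The arithmetic behind this point can be illustrated with a short calculation. The all-household estimate is a mixture of the telephone and nontelephone estimates, so the bias of a telephone-only estimate equals the nontelephone proportion times the difference between the two groups. The sketch below uses the enrollment percentages from table 2-1 and, as an assumption, the unweighted case counts as a stand-in for the nontelephone proportion; the symbol names are illustrative, and the published figures rest on weighted data, so the result is only approximate.

```python
def coverage_bias(p_nontel, y_tel, y_nontel):
    """Bias of a telephone-only estimate relative to the all-household value.

    The all-household estimate is (1 - p) * y_tel + p * y_nontel, so the
    bias of y_tel is y_tel minus that mixture, which is p * (y_tel - y_nontel).
    """
    return p_nontel * (y_tel - y_nontel)


# Unweighted share of sampled children in nontelephone households (an
# approximation; the published percentages are weighted).
p_nontel = 1011 / 10997

# Percent attending or enrolled in regular school, from table 2-1.
bias = coverage_bias(p_nontel, 76.0, 67.8)
print(round(bias, 2))
```

Even with a gap of about 8 percentage points between the two groups, the implied bias is under 1 percentage point because the nontelephone share is small.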

The bias for subgroups may be affected differently than that for aggregates across all groups. The 
main reason is that the proportion of households without telephones is larger for some subgroups than the 
proportion for the population as a whole. For example, while only about 10 percent of all 3- to 7-year-olds 
are in nontelephone households, for Hispanics, non-Hispanic blacks, and non-Hispanic nonblacks 
the percentages of 3- to 7-year-olds in nontelephone households are 17, 23, and 5, respectively. Thus, the 
potential for bias is much greater for estimates of Hispanics and blacks. It should also be noted that the 
difference in coverage rates by race/ethnicity can create biases in estimates of the total population even if 
the characteristics of telephone and nontelephone households are identical. This occurs because the 
race/ethnicity mix of a telephone sample may differ from the distribution of the total population, and this 
can create biases for characteristics that vary among the three major race/ethnicity groups. 
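
A small numerical sketch may make the composition effect concrete. It assumes that, within each race/ethnicity group, children in telephone and nontelephone households have exactly the same characteristic, so any bias comes only from the changed group mix. The nontelephone percentages (17, 23, and 5) come from the text and the group rates from the all-households column of table 2-4; the population shares are hypothetical.

```python
# (population share, share in nontelephone households, percent enrolled)
# Population shares are hypothetical; the other figures are from the text
# and from the all-households column of table 2-4.
groups = {
    "Hispanic": (0.12, 0.17, 69.8),
    "Black, non-Hispanic": (0.15, 0.23, 74.5),
    "Nonblack, non-Hispanic": (0.73, 0.05, 76.0),
}

# Estimate for the total population.
population = sum(share * rate for share, _, rate in groups.values())

# An unadjusted telephone sample covers each group at a different rate, so
# its race/ethnicity mix differs from that of the population.
covered = {g: share * (1.0 - nontel) for g, (share, nontel, _) in groups.items()}
total_covered = sum(covered.values())
telephone = sum(covered[g] * groups[g][2] for g in groups) / total_covered

composition_bias = telephone - population
```

The bias here is small but not zero, even though telephone status is unrelated to the characteristic within each group. Raking to race/ethnicity control totals is intended to remove this component.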

To examine the potential for bias in these subgroups more closely, table 2-4 shows the estimates 
for all households, for adjusted telephone households, and the associated biases by race/ethnicity of the 
child. The estimated differences by race/ethnicity are larger than the aggregates across the entire 
population. These are not negligible, but they are still less than what would have been observed if no 
adjustments for undercoverage had been made. 

Two reasons account for the apparently larger biases for the subgroup estimates. First, the relative 
adjustment factors used in this simulation include cells for race/ethnicity. The only adjustment factor 
operating within the race/ethnicity cells is associated with family income. Therefore, the bias 
adjustments are smaller within these subgroups, and the use of the relative adjustment factors is likely to 
weaken the bias-reducing properties for these subgroups. The relative adjustment factors are not as 
variable within a race/ethnicity cell as they are over all cells, and the ability to mitigate the biases within 
these cells is limited. In the actual application of the raking adjustments in NHES:93, the full adjustment 
factors were used, so a greater opportunity to reduce biases exists. 

The second reason for the apparently larger differences relates to the precision of the estimated 
differences. The difference between the estimate for the adjusted telephone households and all 
households is the estimated bias. The estimated bias has a relatively large sampling error.^ It is difficult 
to assess the estimated differences or biases for subgroups, since the sampling errors on these statistics 
are so large that none of them is significantly different from zero. 

These two points relate back to the main reasons for adjusting the estimates. The adjustments are 
made with the hope that persons within the adjustment cells are homogeneous with respect to the 
characteristics being estimated. When this is true, the adjustments will tend to decrease the bias. Within 
the adjustment cells, undercoverage biases may persist if persons in telephone and nontelephone 
households have substantial differences in characteristics. Unfortunately, the databases available do not 
have sample sizes large enough to examine these differences very well. 



^Technically, the estimated bias is the difference between the estimated total from the telephone households with a revised 
weight and the estimated total from the nontelephone households. The revised weight is the differential sampling weight for 

the case multiplied by a complex factor that can be written as: ^ )j , where A is the average adjustment 

factor to make the sum of the telephone household weights equal to the national total and a” is the average adjustment 
factor to make the sum of the weights for all sampled households equal to the national total. This estimate could be negative 
if the raking adjustment increases the bias for a particular characteristic beyond what would result if no adjustments were 
made to the weights. 

Since the estimated bias is the difference of two independent components, its variance is the sum of the variances for the 
components. The variance for the estimated total for the nontelephone households is relatively large, especially for 
subgroups. There were only 1,011 3- to 7-year-olds in nontelephone households in the October 1992 CPS with 207 
Hispanics, 336 non-Hispanic blacks, and 468 non-Hispanic nonblacks. Estimates based on samples of this size from the CPS 
generally have sampling errors between 2 and 30 percent of the size of the estimates, depending on the subgroups. Even 
without evaluating the variance of the second term, it is clear that the sampling error of the estimated bias is large. 
For the statistics computed for the NHES:91 Early Childhood Education component, the 
adjustments were very effective in reducing bias. The adjustments made to the estimates of telephone 
households virtually eliminated the coverage bias as shown in table 2-7. This result contrasts with the 
findings from the current research. The biases for race/ethnicity subgroups were larger than those across 
the total population in the NHES:91 (table 2-8), paralleling the results from the current study. 



Conclusions 

The analysis of undercoverage bias shows that the coverage bias for statistics on 3- to 7-year-olds 
in the NHES:93 is not large. This finding is true even though large differences are reported for children 
living in telephone and nontelephone households. The estimates were adjusted using variables correlated 
with the presence of a telephone in the household. For some estimates, the adjustment reduced the bias 
from undercoverage. For others, the adjustment did not affect the bias or slightly increased the bias. 

In large sample surveys like the NHES:93, nonsampling error is often the source of much larger 
errors than arise from sampling. Coverage is an important source of nonsampling error, and it is 
important to review the potential of bias from this source. This analysis reveals that for many types of 
aggregates the residual bias associated with undercoverage is not a major problem. 

As noted above, the undercoverage bias for smaller subgroups could be more problematic and 
require additional research. The undercoverage bias for most subgroups is not likely to be a major 
problem after adjustment. However, the potential for bias is greatest for those subgroups in which a 
large proportion live in nontelephone households. These findings suggest that additional analysis of the 
undercoverage for the SR component is not necessary, unless some specific subgroup that is likely to 
have much poorer than average coverage is the subject of a detailed analysis. 

No general rule adequately addresses all the subgroups that may be analyzed. When dealing with 
a small subgroup that is likely to be differentially undercovered, data users should consider the possible 
impact of different sources of error. Both sampling errors and nonsampling errors from coverage bias 
are likely to be relatively large for such rare groups. 

Despite the complications for rare subgroups that have low telephone coverage rates, the 
usefulness of the statistical adjustments and the low residual undercoverage bias for most statistics 
indicate that telephone data collection is a very cost-effective survey procedure for the populations 
studied in NHES:93. When evaluating the residual bias in the rarer subgroups, it should be recognized 
that the sample size for an in-person interview survey at the same cost would be much smaller than is 
possible in a telephone survey, and estimates for these subgroups would be subject to very large 
sampling errors. For most items, the telephone survey approach provides more information for estimates 
of the subgroups than would be possible for an in-person interview at the same cost. 
Table 2-1.-- Estimated percentage of 3- to 7-year-olds in telephone and nontelephone households who 
have specific characteristics 

                                                Children in     Children in
                                                telephone       nontelephone
Child characteristic                            households      households

Attends or enrolled in regular school¹             76.0%           67.8%
Enrolled in public school²                         76.9            92.3
Enrolled in grade:²
  Nursery school--full time                         6.1             4.3
  Nursery school--part time                        13.8             8.7
  Kindergarten--full time                          11.5            15.2
  Kindergarten--part time                          15.4            15.0
  1st grade                                        25.3            30.0
  2nd grade                                        26.5            26.4
  3rd grade                                         1.3             0.5
Repeated a grade²                                   2.0             6.9
Had disabling condition:¹
  Learning disability                               2.0             2.0
  Mental retardation                                0.8             0.4
  Speech impairment                                 3.2             3.6
  Serious emotional disturbance                     0.7             0.7
  Deafness                                          0.5             0.4
  Other hearing impairment                          1.3             1.4
  Blindness                                         0.3             0.3
  Other vision impairment                           0.9             1.2
  Orthopedic impairment                             0.9             0.8
  Other health imp. lasting 6 months or more        1.6             1.6
  None of the above                                89.2            87.3

¹ Estimates are based on all children (10,997 in all and 1,011 in nontelephone households). 

² Estimates are based on enrolled children (8,353 in all and 695 in nontelephone households). 

NOTE: Due to rounding, details may not add to totals. 

SOURCE: Special tabulations from the October 1992 Current Population Survey. 



Table 2-2.-- Relative raking adjustment factors for NHES:93 School Readiness component, by 
race/ethnicity and family income 

Race/ethnicity              Family income           Relative factor

Hispanic                    Less than $10,000            1.30
Hispanic                    $10,000 to $24,999           0.90
Hispanic                    $25,000 or more              0.49
Black, non-Hispanic         Less than $10,000            1.97
Black, non-Hispanic         $10,000 to $24,999           1.57
Black, non-Hispanic         $25,000 or more              0.74
Non-Hispanic, nonblack      Less than $10,000            1.35
Non-Hispanic, nonblack      $10,000 to $24,999           1.12
Non-Hispanic, nonblack      $25,000 or more              0.74

SOURCE: U.S. Department of Education, National Center for Education Statistics, National Household Education Survey 
(NHES), spring 1993. 



Table 2-3.-- Estimated percentage of 3- to 7-year-olds in all households who have specific 
characteristics, adjusted estimates based on raking only children in telephone households, 
and the bias of the estimates before and after adjustment 

                                                            Bias in                     Bias in adjusted
                                              Children      telephone     Adjusted      telephone
                                              in all        household     telephone     household
Child characteristic                          households    estimates     households    estimates

Attends or is enrolled in regular school¹       75.0%          1.0%         75.4%          0.4%
Enrolled in public school²                      78.4          -1.5          80.2           1.8
Enrolled in grade²
  Nursery school--full time                      6.0           0.1           6.4           0.4
  Nursery school--part time                     13.2           0.5          12.9          -0.3
  Kindergarten--full time                       12.0          -0.5          12.0           0.0
  Kindergarten--part time                       15.4           0.0          15.2          -0.2
  1st grade                                     25.8          -0.4          25.9           0.1
  2nd grade                                     26.3           0.2          26.4           0.2
  3rd grade                                      1.3           0.0           1.2          -0.1
Repeated a grade²                                2.7          -0.5           2.5          -0.1
Had disabling condition¹
  Learning disability                            2.1          -0.1           2.3           0.3
  Mental retardation                             0.7           0.1           0.9           0.2
  Speech impairment                              3.3          -0.1           3.5           0.2
  Serious emotional disturbance                  0.7           0.0           0.8           0.1
  Deafness                                       0.5           0.0           0.5           0.0
  Other hearing impairment                       1.3           0.0           1.5           0.1
  Blindness                                      0.3           0.0           0.4           0.0
  Other vision impairment                        1.0          -0.1           1.0           0.0
  Orthopedic impairment                          0.9           0.0           1.0           0.1
  Other health imp. lasting 6 months or more     1.6           0.0           1.8           0.2
  None of the above                             88.8           0.4          88.8           0.0

¹ Estimates are based on all children (10,997 in all and 9,986 in telephone households). 

² Estimates are based on enrolled children (8,353 in all and 7,658 in telephone households). 

NOTE: Due to rounding, details may not add to totals. A negative bias indicates that the sample estimate is smaller than the 
estimate based on all households. 

SOURCE: Special tabulations from the October 1992 Current Population Survey. 
Table 2-4.-- Estimated percentage of 3- to 7-year-olds in all households who have specific 
characteristics, adjusted estimates based on raking only children in telephone households, 
and the bias of the adjusted estimates, by race/ethnicity 

                                                                          Bias in adjusted
                                              Children      Adjusted      telephone
                                              in all        telephone     household
Child characteristic                          households    households    estimates

Attends or is enrolled in regular school¹
  All                                           75.0%         75.4%          0.4%
  Hispanic                                      69.8          71.6           1.8
  Black, non-Hispanic                           74.5          75.8           1.3
  Nonblack, non-Hispanic                        76.0          75.8          -0.2
Enrolled in public school²
  All                                           78.4          80.2           1.8
  Hispanic                                      89.7          91.8           2.1
  Black, non-Hispanic                           89.3          91.6           2.3
  Nonblack, non-Hispanic                        74.4          75.5           1.1
Enrolled in grade²
  All
    Nursery school--full time                    6.0           6.4           0.4
    Nursery school--part time                   13.2          12.9          -0.3
    Kindergarten--full time                     12.0          12.0           0.0
    Kindergarten--part time                     15.4          15.2          -0.2
    1st grade                                   25.8          25.9           0.1
    2nd grade                                   26.3          26.4           0.2
    3rd grade                                    1.3           1.2          -0.1
  Hispanic
    Nursery school--full time                    3.7           3.5          -0.2
    Nursery school--part time                    7.9           8.5           0.5
    Kindergarten--full time                     13.0          12.1          -0.8
    Kindergarten--part time                     18.3          19.8           1.5
    1st grade                                   28.5          27.3          -1.2
    2nd grade                                   26.5          26.6           0.0
    3rd grade                                    2.0           2.3           0.3
  Black, non-Hispanic
    Nursery school--full time                   10.5          10.5           0.0
    Nursery school--part time                    5.9           6.2           0.3
    Kindergarten--full time                     19.5          18.7          -0.8
    Kindergarten--part time                     10.3           9.6          -0.7
    1st grade                                   25.3          26.6           1.3
    2nd grade                                   27.0          27.4           0.4
    3rd grade                                    1.6           1.1          -0.5
Table 2-4.-- Estimated percentage of 3- to 7-year-olds in all households who have specific 
characteristics, adjusted estimates based on raking only children in telephone households, 
and the bias of the adjusted estimates, by race/ethnicity--Continued 

                                                                          Bias in adjusted
                                              Children      Adjusted      telephone
                                              in all        telephone     household
Child characteristic                          households    households    estimates

Enrolled in grade
  Nonblack, non-Hispanic
    Nursery school--full time                    5.4%          5.6%          0.2%
    Nursery school--part time                   15.6          15.4          -0.2
    Kindergarten--full time                     10.2          10.1          -0.1
    Kindergarten--part time                     16.1          16.2           0.1
    1st grade                                   25.5          25.5           0.1
    2nd grade                                   26.1          26.2           0.1
    3rd grade                                    1.2           1.1          -0.1
Repeated a grade²
  All                                            2.7           2.5          -0.1
  Hispanic                                       2.6           2.1          -0.5
  Black, non-Hispanic                            4.5           4.3          -0.2
  Nonblack, non-Hispanic                         2.3           2.1          -0.2
Had disabling conditions¹
  All
    Learning disability                          2.1           2.3           0.2
    Mental retardation                           0.7           0.9           0.2
    Speech impairment                            3.3           3.5           0.2
    Serious emotional disturbance                0.7           0.8           0.1
    Deafness                                     0.5           0.5           0.0
    Other hearing impairment                     1.3           1.5           0.1
    Blindness                                    0.3           0.4           0.0
    Other vision impairment                      1.0           1.0           0.0
    Orthopedic impairment                        0.9           1.0           0.1
    Other health imp. lasting 6 months or more   1.6           1.8           0.2
    None of the above                           88.8          88.8           0.0
Table 2-4.-- Estimated percentage of 3- to 7-year-olds in all households who have specific 
characteristics, adjusted estimates based on raking only children in telephone households, 
and the bias of the adjusted estimates, by race/ethnicity--Continued 

                                                                          Bias in adjusted
                                              Children      Adjusted      telephone
                                              in all        telephone     household
Child characteristic                          households    households    estimates

  Hispanic
    Learning disability                          0.9%          1.0%          0.1%
    Mental retardation                           0.6           0.9           0.3
    Speech impairment                            1.3           1.1          -0.2
    Serious emotional disturbance                0.2           0.3           0.1
    Deafness                                     0.3           0.4           0.2
    Other hearing impairment                     0.2           0.3           0.1
    Blindness                                    0.1           0.1           0.1
    Other vision impairment                      0.5           0.5          -0.1
    Orthopedic impairment                        0.4           0.8           0.3
    Other health imp. lasting 6 months or more   1.2           1.5           0.3
    None of the above                           89.5          89.3          -0.1
  Black, non-Hispanic
    Learning disability                          2.5           3.4           1.0
    Mental retardation                           0.9           1.4           0.5
    Speech impairment                            3.4           5.0           1.6
    Serious emotional disturbance                0.8           1.3           0.4
    Deafness                                     0.4           0.5           0.1
    Other hearing impairment                     1.3           1.8           0.5
    Blindness                                    0.5           0.5           0.1
    Other vision impairment                      1.3           1.4           0.1
    Orthopedic impairment                        1.1           1.0           0.0
    Other health imp. lasting 6 months or more   1.8           2.4           0.6
    None of the above                           87.0          85.8          -1.2
  Nonblack, non-Hispanic
    Learning disability                          2.2           2.2           0.0
    Mental retardation                           0.7           0.8           0.1
    Speech impairment                            3.6           3.4          -0.1
    Serious emotional disturbance                0.8           0.8           0.0
    Deafness                                     0.5           0.5           0.0
    Other hearing impairment                     1.5           1.5           0.0
    Blindness                                    0.4           0.3           0.0
    Other vision impairment                      1.0           1.0           0.0
    Orthopedic impairment                        1.0           1.0           0.0
    Other health imp. lasting 6 months or more   1.7           1.7           0.0
    None of the above                           89.1          89.5           0.5

¹ Estimate is based on all children (10,997 in all and 9,986 in telephone households; for black children 1,528 in all and 1,192 in 
telephone households; for Hispanic children 1,118 in all and 911 in telephone households). 

² Estimate is based on enrolled children (for all children 8,353 in all and 7,658 in telephone households; for Hispanic children 
792 in all and 657 in telephone households). 

SOURCE: Special tabulations from the 1992 Current Population Survey. 



Table 2-5.-- Estimated percentage of 3- to 5-year-olds in telephone and nontelephone households who 
engaged in specific activities with family members 

                                      Children in telephone households    Children in nontelephone households
Activities of 3- to 5-year-olds                  Frequency                            Frequency
with family members                    None     1 or 2    3 or more         None     1 or 2    3 or more

Activity in the last week
  Read to                               7.0%     23.6%      69.4%           21.4%     41.7%      36.9%
  Taught letters, words, numbers       15.8      27.7       56.5            30.5      33.5       36.1
  Taught songs or music                32.2      31.2       36.6            48.2      27.9       23.8
  Did arts and crafts                  35.3      32.4       32.3            56.3      26.9       16.8
  Played games or sports               13.7      33.4       52.9            28.0      36.8       35.2
  Watched educational TV               27.8      25.2       46.9            39.8      22.7       37.5

                                              Within the last                      Within the last
                                       Month     Year       No              Month     Year       No

Activity in the last month/year
  Visited a library                    36.0%     23.3%      40.7%           14.0%     14.3%      71.6%
  Gone to a movie                      28.3      38.3       33.4            21.9      24.3       53.8
  Gone to a play/concert/live show     11.3      27.7       61.1             7.1       8.5       84.3
  Visited art gallery, etc.            13.1      33.8       53.1             7.4      10.1       82.5
  Visited zoo/aquarium                 17.0      51.3       31.6             8.3      27.2       64.5
  Visited playground/park              75.1      18.1        6.8            66.0      18.6       15.4

NOTE: Due to rounding, details may not add to totals. 

SOURCE: Special tabulations from the October 1990 Current Population Survey. 
Table 2-6.-- Relative raking adjustment factors for NHES:91 Early Childhood Education component, by 
race/ethnicity and family income 

Race/ethnicity              Family income           Relative factor

Hispanic                    Less than $10,000            2.07
Hispanic                    $10,000 to $24,999           1.18
Hispanic                    $25,000 or more              0.87
Black, non-Hispanic         Less than $10,000            2.67
Black, non-Hispanic         $10,000 to $24,999           1.41
Black, non-Hispanic         $25,000 or more              1.13
Non-Hispanic, nonblack      Less than $10,000            1.45
Non-Hispanic, nonblack      $10,000 to $24,999           0.99
Non-Hispanic, nonblack      $25,000 or more              0.79

SOURCE: U.S. Department of Education, National Center for Education Statistics, National Household Education Survey 
(NHES), 1991. 



Table 2-7.-- Estimated percentage of 3- to 5-year-olds in all households who engaged in specific 
activities with family members and adjusted estimate based on raking only 3- to 5-year-olds 
in telephone households 

                                         Children in all households         Adjusted telephone households
Activities of 3- to 5-year-olds                  Frequency                            Frequency
with family members                    None     1 or 2    3 or more         None     1 or 2    3 or more

Activity in the last week
  Read to                               8.5%     25.5%      65.9%            8.9%     24.7%      66.4%
  Taught letters, words, numbers       17.4      28.3       54.3            16.5      27.6       55.9
  Taught songs or music                33.9      30.8       35.3            33.1      30.9       36.0
  Did arts and crafts                  37.6      31.8       30.6            38.4      30.9       30.7
  Played games or sports               15.3      33.7       51.0            15.2      33.6       51.2
  Watched educational TV               29.1      25.0       45.9            28.5      25.3       46.2

                                              Within the last                      Within the last
                                       Month     Year       No              Month     Year       No

Activity in the last month/year
  Visited a library                    33.7%     22.3%      44.0%           33.6%     22.2%      44.2%
  Gone to a movie                      27.6      36.8       35.6            28.5      36.1       35.5
  Gone to a play/concert/live show     10.8      25.6       63.6            11.1      25.1       63.8
  Visited art gallery, etc.            12.5      31.2       56.3            12.5      30.9       56.6
  Visited zoo/aquarium                 16.1      48.8       35.1            16.9      48.4       34.7
  Visited playground/park              74.2      18.2        7.7            73.8      18.4        7.8

NOTE: Due to rounding, details may not add to totals. 

SOURCE: Special tabulations from the October 1990 Current Population Survey. 
Table 2-8.-- Estimated percentage of 3- to 5-year-olds in all households who engaged in specific 
activities with family members and adjusted estimate based on raking only 3- to 5-year-olds 
in telephone households, by race/ethnicity 

                                         Children in all households         Adjusted telephone households
Activities of 3- to 5-year-olds                  Frequency                            Frequency
with family members                    None     1 or 2    3 or more         None     1 or 2    3 or more

Read to by family member
  All                                   8.5%     25.5%      65.9%            8.9%     24.7%      66.4%
  Hispanic                             26.8      32.0       41.2            27.0      28.5       44.5
  Black, non-Hispanic                  14.4      35.1       50.5            12.5      31.8       55.6
  Nonblack, non-Hispanic                4.5      22.6       72.9             4.4      21.9       73.7

                                              Within the last                      Within the last
                                       Month     Year       No              Month     Year       No

Visited a library
  All                                  33.7%     22.3%      44.0%           33.6%     22.2%      44.2%
  Hispanic                             20.9      16.9       62.1            23.0      17.6       59.4
  Black, non-Hispanic                  22.4      19.1       58.6            24.4      20.1       55.6
  Nonblack, non-Hispanic               37.9      23.8       38.2            38.3      23.7       38.0

NOTE: Due to rounding, details may not add to totals. 

SOURCE: Special tabulations from the October 1990 Current Population Survey. 
3. An Assessment of Data Quality 
from Recorded Interviews 



Overview 

The purpose of this portion of the working paper is to report the results of an evaluation of some 
aspects of the quality of interviews conducted for the 1993 National Household Education Survey 
(NHES:93). The NHES:93 consisted of two components: the School Readiness (SR) component which 
was administered to parents of children 3 to 7 years old or in second grade or below; and the School 
Safety and Discipline (SS&D) component which was administered to parents of children in grades 3 
through 12, and also administered to youth in grades 6 through 12. 

The evaluation is based on a sample of SR and SS&D interviews that were tape recorded during 
the regular conduct of the NHES:93. In all, 45 SS&D interviews and 25 SR interviews were recorded 
and used in this assessment. 

The evaluation was carried out by applying behavioral coding methods adapted from Oksenberg 
et al. (1991) to the recorded interviews. Both respondent and interviewer behavior were evaluated, since 
both are indicators of the quality of the interview process. Some measures of the reliability of the coding 
of the behaviors were also included by having two coders assess the same interviews. 

The findings indicate that there were relatively few instances in which the interviewer did not 
follow the prescribed procedures or the respondent did not provide a codeable response. The most 
frequent problem involved interviewers clarifying questions and respondents asking for clarification. 
Other problem areas are noted and potential reasons for these problems are suggested. 

The next section provides some background on the concepts underlying behavioral coding and the 
value of this approach. The methods used in this study are explained in the following section. The 
results of the evaluation are then presented in the next sections, including the analysis of the quality of 
the coding. The last section discusses the implications of the findings for this study and future NHES 
data collection, along with some suggestions for further study. 



Background 

Structured questionnaires, such as the SS&D and SR, depend on the interviewer following strict 
rules of behavior. Questions are to be read exactly as worded. When probing or clarification is needed, 
the interviewer should follow a prescribed sequence of actions (e.g., repeat the question, provide 
nondirective feedback). Following this protocol does not allow the interviewer and respondent to follow 
normal rules of conversation. Nonetheless, structure is needed to ensure that all respondents are exposed 
to the same measurement process.^ A well-designed questionnaire will minimize the awkward nature of 
the interviewer-respondent interaction and ensure that all respondents are exposed to the same set of 
questions. If the questionnaire is poorly constructed, respondents will frequently interrupt questions, and 
interviewers may be forced to reword questions or provide extensive clarification. These deviations 
from prescribed protocols are considered indications of a poorly designed questionnaire, which, in turn, 
leads to measurement error. 



^For a discussion of the advantages and disadvantages of structured interviews, see Suchman and Jordan (1990). 
Based on this logic, Oksenberg, et al. (1991) developed behavior coding schemes to pre-test and
evaluate structured questionnaires. These schemes provide systematic data on the behavior of 
interviewers and respondents to test whether interviewers are systematically deviating from protocols 
and whether respondents can provide data in the expected form without extensive, unstructured (i.e., not 
scripted), interactions with the interviewer. For interviewers, examples of these behaviors include 
whether the question is read exactly as written or changed in some way or needs clarification. For 
respondents, examples include whether the respondent provides a codeable response or asks for 
clarification. To the extent that questions are not read as worded, codeable responses are not provided, 
or clarification is needed, problems may exist with a particular sequence of questions or entire 
questionnaires. 

This behavior coding scheme has been applied in a number of instances to evaluate questionnaires 
(Esposito et al. 1991; Burgess and Paton 1993). This method is useful for revealing a broad range of 
problems that would be directly reflected in interviewer or respondent behavior. The method is limited, 
however, in two important ways. First, it does not provide a reason for why a problem may exist. It only 
provides points in the questionnaire that seem to be leading to problems. Once a question with 
systematic problems is identified, further analysis is required to assess exactly why the question might be 
problematic (e.g., wording too complex, question is too long, question is out of context). Second, the 
method is dependent on the problem being manifested by interviewers or respondents. This, in many 
ways, is a minimal standard to assess data quality.^ It is reasonable to expect that when interviewers do
ask the question as worded and respondents do provide codeable responses, respondents may still be 
subject to a wide array of errors (e.g., does not fully understand the question, does not remember 
properly, intentionally conceals information). Despite these limitations, the behavior coding scheme 
used in this study does provide a quantifiable indication of how well the questionnaire facilitates the 
ability of the interviewers to follow intended procedures and the respondents to provide codeable 
responses. 

This study complements two other evaluations of the NHES:93 interviews. The others are the 
report of the quality of interviewer performance based on coded monitoring activities (Design, Data
Collection Monitoring, Interview Administration Time and Data Editing in the 1993 National Household
Education Survey, Brick et al. forthcoming) and the report of the reinterview study (Reinterviews in the
1993 National Household Education Survey, Brick et al. 1996). The use of behavior coding attempts to
assess the quality of the questionnaire by noting systematic problems associated with deviating from 
prescribed protocols. This contrasts with the evaluation of individual interviewer performance, which 
rates the overall quality of the individual interviewers used on the study. The analysis of the reinterview 
information will provide a measure of the reliability of the responses provided during the interview. To 
the extent that interviewer performance is of high quality and the questionnaire is designed properly, 
measures of reliability should be high. The item-specific reliabilities from this analysis can be used as 
one indication of the seriousness of item-specific problems pointed out by the behavioral coding. For
example, if the behavioral coding points to a particular question sequence as having a large number of 
clarifications required by the interviewers, the analysis of the reinterview data should indicate whether 
these problems are reflected in respondents providing different answers to the same item at different 
times during the interview. 



^Validating survey responses is a long-standing problem associated with any study of this type. Short of finding an external
measure of validity (e.g., school or police records), alternative methods of evaluating the questionnaire (e.g., cognitive 
interviewing) have similar problems associated with obtaining direct measures of measurement error. 
Method



Taping the Interviews 

During late February and early March, six interviewers in two of the telephone centers used for 
conducting the NHES:93 were trained to record a sample of extended interviews. The interviewers were 
trained to ask respondents for permission to tape record the interview for use in a special study. If the 
respondent did not feel comfortable with the recording, the interview was not recorded. 

These tape recorded interviews were batched together for later evaluation. Some of the tapes were 
not of sufficient quality for use in this study. The inability to clearly understand the respondents on the 
recorded interviews was the primary reason for discarding some of the tapes from the analysis. In all, 70 
interviews conducted by 6 interviewers were of sufficient quality to permit their use. However, even in 
some of these tapes it was difficult to understand the respondent in different parts of the interview. 



Coding Approach 

Exhibit 3-1 presents a listing of the codes used in this evaluation and their associated definitions. 
This scheme is adapted from Oksenberg, et al. (1991) by deleting and adding a small number of coding 
categories. 

There are 5 codes relevant to interviewer behavior. These categories are: 

1. Read the question exactly as worded;

2. Read the question with a minor wording change; 

3. Read the question with a major wording change; 

4. Clarified the question for the respondent; and 

5. Displayed some affect. 

The difference between the first 3 codes is the degree to which the interviewer departed from the
script. Minor changes include such things as insertion or omission of particular words that the coder 
judges as not altering the meaning of the question. Major changes are those changes that are judged to 
alter the meaning of the question, such as not reading whole parts of the question. The affect category 
was inserted to try to pick up whether particular questions, especially ones that cover sensitive material, 
were difficult for the interviewer to administer in a neutral manner. This code is not part of the 
Oksenberg, et al. (1991) scheme. 

Respondent behavior was coded using 6 categories. These categories are: 

1. Gave a "correct" response;

2. Interrupted the interviewer before completing the question; 

3. Clarified the question; 

4. Qualified the answer with respect to accuracy; 

5. Did not provide an adequate answer; and 

6. Expressed sensitivity to the question. 





Exhibit 3-1. --Behavior coding indicator definitions 


INDICATOR       DEFINITION

INTERVIEWER

EXACT           Reads question exactly as printed.

MINOR           Reads question changing a minor word (the, an, this) that does not
                alter the question meaning.

MAJOR           Changes wording of the question such that the meaning is altered.
                Interviewer does not complete reading the question as it is written.

CLARIFY         Interviewer provides clarification when evident the respondent does
                not understand question.

AFFECT          Interviewer demonstrates inappropriate affective responses (e.g.,
                laughing) or leading responses/behaviors.

RESPONDENT

CORRECT         Respondent answers question correctly. Respondent answers question
                with a codeable behavior.

INTERRUPT       Interrupts initial question reading with answer.

CLARIFY         Asks for repeat or clarification of question, or makes statement
                indicating uncertainty about question meaning.

QUALIFY         Answer meets question objective but is qualified by the respondent
                indicating uncertainty about accuracy.

NOT ADEQUATE    Answer does not meet question objective.

SENSITIVE       Respondent demonstrates discomfort in responding to question.



SOURCE: U.S. Department of Education, National Center for Education Statistics, National Household Education Survey (NHES), spring 



Providing a "correct" response simply means that the response fit into one of the pre-coded 
response alternatives. This code does not actually measure whether the data correspond to some external 
measure of validity. It is the opposite of category 5 (not providing an adequate answer). 

The data were collected by having two project staff members listen to a taped interview and code 
each question and/or response using the codes described above. Coders indicated whether or not a 
behavior was exhibited during the asking (interviewer behavior) or responding (respondent behavior) to 
each question by checking the relevant code in the space provided on their coding form (see 
Appendix A). The coders placed a check mark on all of the appropriate behavior categories exhibited for 
each questionnaire item. 

Interviewer and respondent behavior within a question could involve multiple interactions. In this 
case, multiple codes were recorded. For example, the interviewer may have made minor changes 
[Minor] to the question wording and also provided clarification [Clarify] to the question. Similarly, the 
respondent may have asked the interviewer for clarification [Clarify] about the question, but ultimately 
provided the correct [Correct] response to the question. 

Of the 70 interviews available for analysis, 56 were listened to by only one of the coders, while 14 
were coded by both individuals. Each coder listened to 15 SR interviews, 15 SS&D parent interviews, 
and 12 SS&D youth interviews. The 14 interviews that were coded by both coders included 5 SR 
interviews, 5 SS&D parent interviews, and 4 SS&D youth interviews. 
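
The allocation described above can be checked with a little arithmetic. The following is a minimal sketch in Python, not part of the original study; the variable names are illustrative. It assumes only the counts stated in the text and uses the fact that the number of distinct interviews equals the two coders' workloads minus the interviews they shared.

```python
# Interviews heard by each coder and by both coders, per questionnaire (from the text).
per_coder = {"SR": 15, "SS&D parent": 15, "SS&D youth": 12}
double_coded = {"SR": 5, "SS&D parent": 5, "SS&D youth": 4}

# Distinct interviews = coder 1 + coder 2 - overlap (inclusion-exclusion).
distinct = {form: 2 * per_coder[form] - double_coded[form] for form in per_coder}

total_distinct = sum(distinct.values())
total_double = sum(double_coded.values())
single_coded = total_distinct - total_double

print(distinct)
print(total_distinct, total_double, single_coded)
```

The per-questionnaire results (25, 25, and 20) match the universe of cases reported with table 3-8, and the totals reproduce the 70 interviews, of which 14 were coded twice and 56 once.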

The relatively small number of interviews that were coded for each questionnaire does not permit 
us to make statistically precise statements about differences between either individual items or 
questionnaires. Consequently, the analysis will concentrate on pointing out general patterns in the data 
that indicate systematic problems with the questionnaire. 



Coder Training and Coding 

Two project staff were trained in the coding procedures by a senior project member with 
experience in questionnaire design. One person was a telephone interviewer very familiar with computer 
assisted telephone interviewing (CATI) and the problems often encountered in conducting such 
interviews.^ The other coder was a research assistant familiar with questionnaire coding procedures and 
common coding problems, but not experienced in telephone interviewing. 

For training purposes, both coders and the trainer listened to one tape from each type of interview 
(i.e., SS&D parent, SS&D youth, and SR) as a group. After each question, the codes were discussed and 
decisions were made aloud regarding how to evaluate both interviewer and respondent behaviors with 
respect to the codes. Review of the tapes and the coding definitions continued until both coders felt 
comfortable in their understanding of the code definitions and procedures to follow. Training was 
completed in a few hours. 

The data from the coding forms were keypunched. A fifth of the sample was then extracted and 
examined by hand. Any keypunching inconsistencies identified during this process were checked and 
reentered. Logical consistency checks were also performed on the entire data set. For example, a case
with codes of both correct and not adequate was obviously incorrect. When such problem cases were



^This individual did not administer any of the NHES:93 interviews.
identified, the hard copy rating forms were checked, and in some cases, the actual interviews were
reviewed to identify the correct codes. 
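
A logical consistency check of the kind described above might look like the following. This is a minimal sketch in Python under stated assumptions: the record layout, the function name find_conflicts, and the sample codes are hypothetical and are not taken from the study's data files. Only the rule itself (a case cannot be both "correct" and "not adequate") comes from the text.

```python
# Hypothetical layout: one record per (interview, item), holding the set of
# respondent codes that a coder checked for that item. The data are invented.
records = [
    {"interview": 1, "item": "R38", "respondent": {"clarify", "correct"}},
    {"interview": 1, "item": "R51a", "respondent": {"correct", "not adequate"}},
    {"interview": 2, "item": "R93", "respondent": {"qualify", "correct"}},
]

# "Correct" and "not adequate" are opposites, so they cannot both apply.
CONFLICT = {"correct", "not adequate"}


def find_conflicts(rows):
    """Return (interview, item) pairs whose codes include both conflicting codes."""
    return [(r["interview"], r["item"]) for r in rows if CONFLICT <= r["respondent"]]


flagged = find_conflicts(records)
print(flagged)
```

Cases flagged this way would then be resolved as the text describes, by checking the hard copy rating forms or the recordings.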

Coder Reliability 

In this section, the reliability of the coding is quantified using the 14
cases that were completed by both coders. In the next section, the measures of interviewer and
respondent behavior, by questionnaire, are described.

In order to examine the level of agreement between the two coders, interrater reliabilities were 
computed. As mentioned earlier, of the interviews coded, 14 were listened to by both coders (5 SR; 5 
SS&D-parent; 4 SS&D-youth). Those interviews coded by both raters were subjected to an interrater 
reliability analysis. (The interviews used for training purposes were not included.) 

Table 3-1 displays the number and percentage of agreement (and disagreement) in the 
questionnaire items coded by form. As can be seen, the overall level of agreement (including interviewer 
and respondent ratings) ranges from 48% (SS&D-parent) to 68% (SS&D-youth). The ratings for 
interviewer behavior (table 3-2) exhibit less reliability than ratings for respondent behavior (table 3-3). 
On the interviewer side, the lowest agreement was found on the SS&D-parent questionnaire (58%), and 
the highest agreement was found on the SS&D-youth (76%). On the respondent side, levels of 
agreement ranged from 83% (SS&D-parent) to 90% (SR). 
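
The agreement percentages quoted above are simple proportions of the double-coded questions. As a minimal sketch in Python (the function name percent_agreement is illustrative, and the counts are those reported in table 3-1), the overall figures can be reproduced as follows.

```python
# (rated the same, rated differently) counts from table 3-1.
table_3_1 = {
    "SR": (439, 217),
    "SS&D-parent": (290, 316),
    "SS&D-youth": (228, 105),
}


def percent_agreement(same, different):
    """Share of double-coded questions on which both coders gave the same rating."""
    return 100.0 * same / (same + different)


agreement = {form: percent_agreement(*counts) for form, counts in table_3_1.items()}
for form, pct in agreement.items():
    print(f"{form}: {pct:.0f}%")
```

Rounded to whole percentages, these values are 67, 48, and 68, as reported. Note that simple percent agreement does not adjust for agreement expected by chance.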

The distinction between interviewers wording the question with the "exact" words as opposed to 
with "minor" changes seemed to be the hardest distinction for the coders to make. This is illustrated in 
table 3-4, which provides the frequency with which each coder assigned either of these two codes. As 
can be seen, Coder 2 assigned "exact" more often than Coder 1. The opposite is the case with the use of
"minor" changes. From debriefing the coders, one key reason for this difference was that when the
interviewer paused for significant amounts of time during parts of the same question, Coder 1 tended to
code this as minor, whereas Coder 2 coded it as exact.

Since the differences between minor and exact were small from a definitional viewpoint, a 
decision was made to collapse the two codes and recompute the reliabilities. As expected, the interrater 
reliabilities substantially increased as a function of the collapsing of these two codes. As can be seen in 
table 3-5, the overall agreement increased for SR from 67% to 83%, for SS&D-parent from 48% to 78%, 
and for SS&D-youth from 68% to 84%. Table 3-6 presents the reliabilities of the interviewer ratings 
made by both coders. These findings also show substantial increases in rater agreement. 
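
The effect of collapsing two codes on an agreement rate can be illustrated with a small example. The sketch below is in Python and rests on assumptions: the four items and their codes are made up, and agreement is taken to mean that both coders checked exactly the same set of codes for an item, which the report does not spell out.

```python
def collapse(codes):
    """Map "exact" and "minor" to one combined code; leave other codes alone."""
    return frozenset("exact/minor" if c in ("exact", "minor") else c for c in codes)


def agreement_rate(coder1, coder2, transform=frozenset):
    """Fraction of items on which both coders assigned the same set of codes."""
    matches = sum(transform(a) == transform(b) for a, b in zip(coder1, coder2))
    return matches / len(coder1)


# Made-up interviewer codes for four items, one list entry per item.
coder1 = [{"minor"}, {"exact"}, {"minor", "clarify"}, {"exact"}]
coder2 = [{"exact"}, {"exact"}, {"exact", "clarify"}, {"major"}]

before = agreement_rate(coder1, coder2)
after = agreement_rate(coder1, coder2, transform=collapse)
print(before, after)
```

In this toy case, disagreements that were purely exact-versus-minor disappear after collapsing, while the genuine disagreement involving a "major" code remains.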



Table 3-1.— Overall level of agreement (interviewer and respondent) of ratings 



                                    Number of questions
               Number    Rated the same      Rated differently    Total
Form           of forms  Number   Percent    Number   Percent     Number

SR                5         439       67%       217       33%        656
SS&D Parent       5         290       48%       316       52%        606
SS&D Youth        4         228       68%       105       32%        333



SOURCE: U.S. Department of Education, National Center for Education Statistics, National Household Education Survey (NHES), spring 
1993. 



Table 3-2.— Level of agreement of ratings for interviewer behavior 



                                    Number of questions
               Number    Rated the same      Rated differently    Total
Form           of forms  Number   Percent    Number   Percent     Number

SR                5         468       71%       188       29%        656
SS&D Parent       5         350       58%       256       42%        606
SS&D Youth        4         252       76%        81       24%        333



SOURCE: U.S. Department of Education, National Center for Education Statistics, National Household Education Survey (NHES), spring 
1993. 



Table 3-3.— Level of agreement of ratings for respondent behavior 



                                    Number of questions
               Number    Rated the same      Rated differently    Total
Form           of forms  Number   Percent    Number   Percent     Number

SR                5         592       90%        64       10%        656
SS&D Parent       5         503       83%       103       17%        606
SS&D Youth        4         290       87%        43       13%        333



SOURCE: U.S. Department of Education, National Center for Education Statistics, National Household Education Survey (NHES), spring 



Table 3-4.— Number of exact/minor codes by rater and form 





Coder and form       Exact    Minor    Combined

CODER 1
  SR                   419      229         648
  SS&D Parent          224      379         603
  SS&D Youth           216      114         330

CODER 2
  SR                   518      131         649
  SS&D Parent          434      168         602
  SS&D Youth           260       72         332



SOURCE: U.S. Department of Education, National Center for Education Statistics, National Household Education Survey (NHES), spring 
1993. 



Table 3-5.-Overall level of agreement of ratings after collapsing "minor" and "exact" codes 



                                    Number of questions
               Number    Rated the same      Rated differently    Total
Form           of forms  Number   Percent    Number   Percent     Number

SR                5         543       83%       113       17%        656
SS&D Parent       5         475       78%       131       22%        606
SS&D Youth        4         281       84%        52       16%        333



SOURCE: U.S. Department of Education, National Center for Education Statistics, National Household Education Survey (NHES), spring 



Table 3-6.-Level of agreement of ratings for interviewer behavior after collapsing "minor" 
and "exact" codes 



                                    Number of questions
               Number    Rated the same      Rated differently    Total
Form           of forms  Number   Percent    Number   Percent     Number

SR                5         581       89%        75       11%        656
SS&D Parent       5         561       93%        45        7%        606
SS&D Youth        4         312       94%        21        6%        333



SOURCE: U.S. Department of Education, National Center for Education Statistics, National Household Education Survey (NHES), spring 
Table 3-7 lists the frequency of each type of rating by coder for the 14 cases that were coded by
both individuals. At least in terms of the distribution of codes within coder, there are slightly more 
differences on the respondent side than the interviewer side. For interviewer behavior, the distributions 
are nearly identical. The only minor exception to this is a slightly smaller number of codes assigned by 
Coder 2 to the "clarify" category. When coding respondent behavior, the "correct", "interrupt" and 
"clarify" categories are very similar. There are, however, differences within the "qualify" and "not 
adequate" categories. Coder 1 assigned many fewer responses in the "qualify" (92 vs. 169) and "not
adequate" (38 vs. 63) categories. The results would seem to indicate that the coders generally agreed on
each type of interviewer behavior. On the respondent side, they agreed when a correct answer was given, 
but had differences in how the "qualify" and "not adequate" categories were used. 



Table 3-7.— Total number of ratings per rating category 



                              CODER 1                             CODER 2
                             SS&D     SS&D                       SS&D     SS&D
Behavior codes         SR   Parent    Youth    Total       SR   Parent    Youth    Total

INTERVIEWER
  Exact/Minor         648      603      330    1,581      649      602      332    1,583
  Major                 8        4        3       15        8        4        2       14
  Clarify              79       90       41      210       72       69       31      172
  Affect               47        9        3       59       40       12        9       61

RESPONDENT
  Correct             569      530      290    1,389      564      522      289    1,375
  Interrupt            16       30        2       48       19       25        3       47
  Clarify              30       22       12       64       29       19       13       61
  Qualify               7       68       17       92       36      110       23      169
  Not Adequate          9       19       10       38       17       25       21       63
  Sensitive             1        0        0        1        0        0        0        0



SOURCE: U.S. Department of Education, National Center for Education Statistics, National Household Education Survey (NHES), spring 
1993. 



Findings 

Overall Ratings by Questionnaire 

Table 3-8 presents the frequency with which each code was assigned for each questionnaire. The
universe of cases included in this, and each following, table is as follows: SR=25; SS&D-parent=25; and 
SS&D-youth=20. Fourteen of these 70 tapes had been coded twice (once by each coder) and used in the 
reliability analysis. For these 14 cases, half were included from each coder. 

As can be seen from table 3-8, exact and minor were the codes used the most frequently for
interviewer behavior. Across all questions, these codes accounted for 85 percent to 89 percent of all the 
assigned codes. Interviewers were slightly more likely to read the item exactly on the SR questionnaire 
(70 percent) when compared to the other two questionnaires (62 percent for SS&D-parent and 66 percent 
for SS&D-youth). The most prevalent of the other problem codes was "clarify" which occurred between
9 percent and 11 percent of the time. There were relatively few "major" codes assigned. The 
questionnaire with the highest percentage of these problems was the SS&D-parent, where this code was 
assigned 51 times (1.5 percent). 
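
The 85 to 89 percent range cited above for the exact and minor codes can be recomputed from the counts in table 3-8. The following Python lines are a minimal sketch for that purpose; the dictionary layout and names are illustrative.

```python
# Interviewer code counts from table 3-8: (exact, minor, all codes).
interviewer_counts = {
    "SR": (2654, 658, 3792),
    "SS&D-parent": (2095, 772, 3363),
    "SS&D-youth": (1231, 440, 1877),
}

# Combined share of "exact" and "minor" among all interviewer codes, in percent.
exact_or_minor_share = {
    form: 100.0 * (exact + minor) / total
    for form, (exact, minor, total) in interviewer_counts.items()
}
for form, share in exact_or_minor_share.items():
    print(f"{form}: {share:.1f}%")
```

The lowest combined share is for the SS&D-parent questionnaire and the highest for the SS&D-youth questionnaire.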

The vast majority of codes for questionnaire items indicate that the respondents answered 
correctly (i.e., provided codeable responses). More problems were evident in the codes for SS&D 
parent respondents; this is consistent with the findings for interviewer behavior, discussed above. The
greater incidence of problems identified in the SS&D parent interviews is evident in the lower 
percentage of “correct” codes and the higher percentages of “not adequate” or “qualify” codes. 



Table 3-8.— Total number of codes given by form 





                           SR (N=25)  SS&D Parent (N=25)   SS&D Youth (N=20)
Behavior codes      Number   Percent    Number   Percent    Number   Percent

INTERVIEWER
  Exact              2,654     70.0%     2,095     62.3%     1,231     65.6%
  Minor                658      17.4       772      23.0       440      23.4
  Major                 21       0.6        51       1.5         7       0.4
  Clarify              371       9.8       382      11.4       171       9.1
  Affect                88       2.3        63       1.9        28       1.5
  All Codes          3,792     100.0     3,363     100.0     1,877     100.0

RESPONDENT
  Correct            2,738     86.9%     2,402     79.7%     1,411     87.5%
  Interrupt             66       2.1        90       3.0         6       0.4
  Clarify              126       4.0       118       3.0        41       2.5
  Qualify              140       4.4       301      10.0        93       5.8
  Not Adequate          70       2.2        93       3.1        62       3.8
  Sensitive             12       0.4         8       0.3         0       0.0
  All Codes          3,152     100.0     3,012     100.0     1,613     100.0



SOURCE: U.S. Department of Education, National Center for Education Statistics, National Household Education Survey (NHES), spring 



Evaluation of Specific Survey Items 

The next set of analyses evaluates the items for each questionnaire. For this analysis, we examine
the frequency with which questionnaire items exhibited a high percent of behavior codes other than 
"exact" or "minor" (for interviewer behavior) or "correct" (for respondent behavior). In addition, for 
those questions that exhibit a high percent of "major" interviewer problems, the comments provided by 
the coders are discussed. These are used to diagnose potential reasons why a problem occurred and 
develop preliminary recommendations. 

For those items coded as problematic because of the high prevalence of some other type of 
respondent/interviewer behavior (e.g., clarify and affect for interviewer behavior, and clarify and qualify 
for respondent behavior), there is no analysis of comments provided by the coders. This is the case 
because, in large part, the coders did not consistently write down comments when using the other codes. 
Consequently, this portion of the evaluation only provides an indication that some type of problem 
exists. In order to pinpoint the reasons for the problems observed, it would be necessary to go back to 
the tape recorded interviews and listen to those portions of the interview that exhibit the problematic 
patterns. 

The next four subsections focus on the different questions and questionnaires. First we discuss
introductory items across all three questionnaires, and then analyze the non-introductory items for SR, 
the non-introductory items for SS&D-parent, and finally the non-introductory items for SS&D-youth. 



Introductory Items 

Table 3-9 presents the frequency of ratings for the introduction sections of each survey. 
Introductions are important because they provide smooth transitions between topics of the questionnaire. 
They inform the respondent that the topic is going to shift and provide the respondent with an idea of 
what is coming next. Training for the NHES:93 placed special emphasis on the need to read these 
introductions exactly as worded. 

As can be seen from the table, interviewers, for the most part, read the introductions exactly or 
with minor revisions. A major change to introduction wording was indicated only once on the SR 
questionnaire (ECINTRO). Interviewers clarified the introduction in three instances, all in the SR 
questionnaire (i.e., KINTRO2, HAINTRO, TVINTRO), and displayed inappropriate affect six times on
the SR questionnaire (three of which were LFINTRO) and two times on SS&D-parent (both on 
PINTRO). 

While the reliability analysis discussed above indicated that one should not distinguish between 
"exact" and "minor" codes, it is worth noting that the introductory statements have extremely high 
numbers of "minor" problems associated with them. For example, for the SR questionnaire, the overall 
ratio of numbers of exact to minor codes for the introductions is 1.7. This compares to a ratio of 4 for all 
items on the questionnaire (2654 to 658). There are several introductions where half or more had a 
minor change (RINTRO, DPINTRO, ECINTRO, HAINTRO, TVINTRO). For several of these, a small 
number of "major" and "clarify" codes are also present. 
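
The two ratios compared above follow directly from the reported counts. A minimal Python sketch, using the SR totals from tables 3-8 and 3-9 and illustrative variable names, is:

```python
# Exact and minor code counts for the SR questionnaire.
intro_exact, intro_minor = 168, 98      # introductions only (table 3-9 totals)
all_exact, all_minor = 2654, 658        # all items (table 3-8)

intro_ratio = intro_exact / intro_minor
overall_ratio = all_exact / all_minor
print(f"{intro_ratio:.1f} {overall_ratio:.1f}")
```

A lower ratio means that minor wording changes were relatively more common, which is the pattern seen for the introductions.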



Table 3-9.— Frequency of rating on introductions 



                                   Rating
FORM/QNUM           Exact    Minor    Major  Clarify   Affect

SR (N=25)
  INTRO                 1        2        0        0        0
  RINTRO                6       15        0        0        0
  DPINTRO               6        7        0        0        0
  ECINTRO              13       11        1        0        0
  SAINTRO               9        3        0        0        0
  TEACHINT              9        3        0        0        0
  KINTRO1               8        5        0        0        1
  KINTRO2              10        2        0        1        0
  PINTRO                5        2        0        0        0
  HAINTRO              14       11        0        1        0
  TVINTRO              14       10        0        1        1
  HNINTRO              19        6        0        0        0
  PKINTRO               8        4        0        0        1
  LFINTRO              18        7        0        0        3
  ARINTRO              13        6        0        0        0
  HINTRO               15        4        0        0        0
  Total               168       98        1        3        6

SS&D PARENT (N=25)
  INTRO                 0        2        0        0        0
  PINTRO                3       16        0        0        2
  SSINTRO              17        8        0        0        0
  SDINTRO              20        5        0        0        0
  TADINTRO             15        6        0        0        0
  CCINTRO              18        7        0        0        0
  LFINTRO               9        9        0        0        0
  HINTRO               12        6        0        0        0
  Total                94       59        0        0        2

SS&D YOUTH (N=20)
  YINTRO                1        7        0        0        0
  SSINTRO              16        3        0        0        0
  TADINTRO             17        2        0        0        0
  Total                34       12        0        0        0



SOURCE: U.S. Department of Education, National Center for Education Statistics, National Household Education Survey (NHES), 
1993. 
The tendency to change the wording of these introductions may reflect the need of the interviewer
to adapt the transition to the specific context of what is being said by the respondent at the time. If this is 
the case, it may be worth considering rewording those introductions that have the highest rate of minor 
problems. It may also reflect old habits interviewers may have. In many surveys, introductions, 
especially those at the time of initial contact, are given as guides, rather than as items to be read 
verbatim. Given this possibility, it may be worth taking a second look at the training materials and placing
even more emphasis on reading these items exactly as worded. 

School Readiness Questions 

Appendix B presents the frequencies for the behavioral codes for each questionnaire item on the 
SR instrument. Overall, there are relatively few questions that received a "major" change in question 
wording by the interviewer or a "not adequate" response by the respondents. Across all questions and the 
25 cases that were coded, the major category was only used a total of 20 times. With one exception, no 
question received this rating more than once. 

The exception was question number R93. This question focused on the number of hours of 
television viewing by the child on Saturday and Sunday. Specifically, the wording of the question was as 
follows: 

R93. "How about on Saturday and Sunday? How many hours does (child) watch
television or video tapes at home on... a. Saturday b. Sunday"

Comments by the coders indicated that the interviewer left out (i.e., skipped) the introduction to 
this item in each of the four instances. This may reflect the fact that the introduction is redundant with 
the answer categories. 

This question is also embedded within a sequence of items where the interviewer needed to clarify 
the question(s) and where the respondent frequently qualified the answer. The fact that the interviewer 
was dropping the introduction to R93 may be indicative of the fact that respondents were having some 
problems with these items and interviewers had a hard time following the prescribed sequence of 
questions. Since the answer categories in R93 are redundant with the introduction, interviewers may 
have been more likely to skip the introduction to maintain conversational continuity. For example, 
questions R92a,b,c (which concern weekday television viewing hours) and R93a,b (which concern
weekend viewing) all had 5 to 9 cases coded as needing interviewer clarification. Similarly, these same
questions had 1 to 11 instances where the respondent somehow qualified his/her answer.^ This indicates
that respondents were not particularly confident in the quality of the information that they were 
providing on hours of television viewing. 

There were a few other questions with a high number of instances in which interviewers or
respondents had to clarify or qualify statements. The series of questions R51a -
R51f had 2 to 6 instances of interviewers clarifying responses and 1 to 3 instances where the responses 
were coded as "not adequate". These questions use a set of pre-coded frequency categories: 

R51a. [On the average, during the first two months of this school year, that is last 
September and October,] did (child) complain about school more than once a 
week, once a week or less or not at all? 



^Remember that these frequencies are based on 25 cases. A question with a frequency of 11 "qualify" responses indicates that
this qualification occurred in 44% of cases (11/25), or nearly half the time.
It may be the case that respondents did not understand how to use the answer categories for these
questions. It may also be the case that interviewers did not carry forward the introduction to this series 
of items (see bracketed phrase above). If this occurred, then the stems of the questions would appear to be
"yes/no" items rather than items answered with one of the frequency categories. To confirm this hypothesis it
would be necessary to review the recordings for the cases exhibiting these problems. 

The question sequence R38 - R39 indicated a relatively large number of clarifications on the part 
of both the interviewer (7 and 6 times) and respondent (1 and 3 times). R38 also had two instances in which
the respondent did not provide an adequate answer.^ These problems may stem from the fact that R38 
contains several qualifying phrases and conditions: 

R38. Not counting child care in private home (or Head Start), how old was (CHILD) in 
years and months when (he/she) first attended any nursery school, pre- 
kindergarten, preschool or day care center? 

This may account for the need of the interviewer to assist the respondent in understanding what is 
being asked and why the respondent asks for clarification. 

Other items that appear to have high instances of interviewer clarification include: R1 (7 times), 
R13 (5 times), R46 (4 times for pre-kindergartners), R55 (9 times), R56 (9 times), R137 (6 times), R167 
(9 times). Several of these items seem straightforward, for example, R1 asks to confirm the child's 
birthdate, R137 covers highest grade completed, and R167 asks for ZIP Code. There do not appear to 
be any associated problems with respondent behavior for any of these questions. It would appear, 
therefore, that while there is quite a bit of clarification for these items, the interviewer and respondent 
do seem to eventually arrive at a response that is both acceptable and not overly qualified by the 
respondent. 



School Safety and Discipline Parent Interview 

Appendix C presents the frequencies for each question on the SS&D-parent questionnaire. As 
discussed earlier, this questionnaire seemed to display the most problems across the three different 
interviews that were examined. The major code was used 51 times (no major code was used on 
introductory sections). 

Questions which received this code 3 to 5 times include: 

P2 (3 times) Child's race. In two instances, coder comments indicated that the interviewer 
paused halfway through the question. This reflects the interviewer waiting for the 
respondent to verify the child's race after each answer category or volunteer a category 
once understanding the range of possible responses. 

P9 (4 times) Type of father who lives in the household. In two instances, the interviewer 
either paused during the question or did not ask the complete question. This question 
actually contains two different questions: Is the father living with the child? If not, who is 
the father figure in the household? It may be worth considering breaking this item up into 
two questions if the item is repeated in a future NHES collection. 



^These frequencies are quite high considering that these questions only apply to those children who had ever attended some 
type of pre-kindergarten program (see Q.37). 
P9a (4 times) The name of the father who lives in the household. In two instances, the 
coders commented that the interviewer "led" the respondent to an answer (e.g., the 
interviewer read a specific name off of the household roster). This seems to be a training 
issue. It should be emphasized that interviewers should either not read names off of the 
household roster or read the entire list. 

PY29 (3 times) Incidence of robbery from students or teachers at school. In two instances, 
the coders commented that the entire question was not completed. In one instance, the 
interviewer paused to allow the respondent to provide an answer. In the other instance, the 
interviewer left out the words "at school". 

PY34 (3 times) Heard of incidents of bullying during school year. In two instances, the 
interviewer did not complete the example portion of the question. In one instance, the 
interviewer paused, which allowed the respondent to interrupt with an answer. In another 
instance, the interviewer simply omitted the example entirely. It may be preferable to 
eliminate one of the two questions asked in the item, e.g., either ask about "bullying" or if 
"students pick on others." 

PY94 (5 times) Parental feelings about their child drinking alcoholic beverages. Coder 
comments indicate that the interviewer did not complete the question. In three instances, 
the last part of the question, "A small amount on ...." was left out. This question might be 
restructured by prefacing it with a short qualifying phrase like "Excluding 
special occasions, ...." and deleting the last sentence, which now has a tendency to be excluded. 
Alternatively, special emphasis could be given in training to make sure the interviewer 
reads the entire question to the respondent before recording the response. 

Other single items that received higher numbers of problematic codes for interviewer behavior 
include P107 (education - interviewer clarified 9 times), P111 (hours worked per week - interviewer 
clarified 8 times), and P122 (zip code - interviewer clarified 11 times). Equivalent items to P107 and P122 
on the SR questionnaires had similar problems. 

In addition to these single items, there were clusters of items with a larger than average number of 
codes that were not "exact", "minor" or "correct". These include: 



Interviewers/Respondents Clarifying, Inadequate Responses 

Items PY92 - PY97. Smoking, drinking and the safety of the respondent's neighborhood. 
These items were higher than average on interviewers clarifying the question or 
respondents interrupting, clarifying or not providing an adequate answer. The most 
extreme example of this is PY95, which was clarified by the interviewer 12 times out of a 
possible 25 cases. These results may be related to the sensitivity surrounding responses 
concerning smoking, drinking, and neighborhood safety. 

Items PY21a - PY23. Experiences of child since beginning of school year, attitude toward 
good grades and behavior. Interviewers clarified a large number of times; respondents also 
asked for clarification, qualified and provided inadequate answers a number of times. 
These items contain two different sets of Likert scales. PY22 switches to a different Likert 
scale. It is not uncommon for respondents to forget the response categories in a series, and 
some problems with noncodeable responses may be alleviated by having interviewers read 
the categories for the first two or three statements in the series. The need for clarification 
may result from some respondents never having given thought to the questions asked about 
school environment, and their request for clarification may be a “stalling tactic.” 



Items PY62a - PY62e. Access to alcohol/drugs while on school grounds. This item had 
problems similar to those of PY21a, though less extreme. This is also a set of items using a Likert 
scale. As noted above, helping the respondent to “catch on” to the response categories in a 
Likert scale may alleviate some response problems. However, these items concerning 
access to alcohol and drugs are, by their nature, sensitive, and some respondents may be 
reluctant to report on such problems at their (or their child’s) school, or may feel 
uninformed. Under these conditions, requests for clarification may reflect stalling. 

Items P13 - P19. Characteristics of the school the child is attending. As with PY92 - 
PY97, these items were high on interviewer clarification. They also resulted in a moderate 
number of instances of respondents clarifying, qualifying and providing inadequate 
answers. P18 and P19 had a large number of instances where the respondent interrupted the 
interviewer to answer the question. Lack of knowledge concerning items such as school 
size may lead to requests for clarification and inadequate responses. Regarding the 
interruptions, it is not uncommon for respondents to stop an interviewer who is reading a 
list when the correct answer (e.g., school size) is reached. 

Respondents Qualifying Answers 

Items P55 - P55h. Security measures in school. Respondents had a tendency to qualify 
their responses to these items. The most extreme case is for item P55e (limits on 
restrooms) in which 15 respondents qualified their answer. Prior to conducting the 
NHES:93, cognitive laboratory activities indicated that parents have imperfect knowledge 
of practices and incidents at their children’s schools. The qualification of answers may 
reflect that respondents are indicating that they are unsure of their answers. 

Items P45 - P47. Incidents that occurred in school: presence of fighting gangs. These 
items were high on respondents qualifying their responses (P46, P47) and interviewers 
clarifying the question (P45, P47). As noted above, some lack of parent knowledge 
concerning incidents at school was anticipated. Qualification of answers may reflect 
parents communicating that they are unsure of their responses. 

Items P68 - P68d. Alcohol and drug education in school. These items were high on 
respondents qualifying their answers. Again, this may be associated with lack of 
knowledge about practices at the child’s school. 

Two general observations can be made from these findings. First, items with Likert scales are 
leading to additional interactions between the interviewer and respondent. This can be seen especially at 
the beginning of the sequence using a particular response format. Mixing Likert scales may be even 
more confusing. See, for example, the number of clarifications required for PY22 (16 times), which 
switches the format of the Likert scale from what had been used in the PY21 series of questions. To 
resolve exactly why these patterns are occurring and whether they are indicative of serious problems in 
the questionnaire, it might be instructive to listen again to those tapes that exhibited the problems and to 
explore these items in cognitive laboratory work if they are used again in the future. 

Second, when a particular question (or set of questions) has a high number of respondents qualifying 
answers, the question may be either worded poorly or asking for information that respondents do not feel 
comfortable providing. Discomfort might result because the respondent does not know the answer (e.g., 
proxy information on the child's curriculum) or because the information requested is sensitive. 
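
One way to make "higher than average" concrete is to convert counts to shares of the 25 recorded cases and rank the items. The sketch below uses only counts quoted in the text of this section. The 40 percent cutoff and the function name are assumptions made for illustration, not a threshold or procedure from the study.

```python
CASES = 25  # recorded SS&D-parent interviews coded (N=25)

# (item, behavior code, count) as quoted in the text of this section.
quoted_counts = [
    ("P55e", "respondent qualified", 15),
    ("PY95", "interviewer clarified", 12),
    ("PY22", "clarification", 16),
]

CUTOFF = 0.40  # hypothetical review threshold, not taken from the report

def flag_items(counts, cases=CASES, cutoff=CUTOFF):
    """Return (item, code, share) tuples at or above the cutoff, highest share first."""
    shares = [(item, code, n / cases) for item, code, n in counts]
    flagged = [row for row in shares if row[2] >= cutoff]
    return sorted(flagged, key=lambda row: row[2], reverse=True)

for item, code, share in flag_items(quoted_counts):
    print(f"{item:5s} {code:22s} {share:.0%}")
```

A ranking of this kind only identifies candidates for review; deciding whether a flagged item is poorly worded or simply sensitive still requires listening to the recordings, as noted above.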



School Safety and Discipline-Youth 

Appendix D presents the frequencies for each question on the SS&D-Youth questionnaire. 
Compared to the SS&D-parent questionnaire, this questionnaire had a smaller number of instances in which 
problem codes were used (that is, codes other than exact, minor, or correct). There were only a total of 7 
instances in which the "major" category was used. The questions that led to the most problems are primarily 
the same questions that displayed problems on the SS&D-parent version. These include: PY21-PY23, 
PY29, PY34, PY55 series, PY62 series and PY92-PY96. 

None of the remaining questions have an extremely large number of problem codes associated 
with them. Those that are above average include Y60a - Y60e (interviewer clarifying), Y44c - Y44f 
(interviewer clarifying and respondent qualifying answer) and PY47. 



Implications 

Overall, the results of this analysis indicate that the majority of questions in the three 
questionnaires were read as written by the interviewer (or with only minor revision) and respondents 
provided a "codeable" response. The major exceptions to this were the introductory items for each section 
of the questionnaire. These items exhibited an unusually high number of instances where there was a 
"minor" change in the wording of the statement. 

The SS&D-parent questionnaire had the highest frequency of problem codes, although a number 
of the questions exhibiting problems were common to the youth version of this questionnaire. We 
speculate that this may be because much of the information that the parent is asked to provide may not 
readily be within his/her knowledge base (e.g., questions on school safety and the school curriculum). 

The most frequently used problem code was the one indicating that the interviewer had to clarify the 
question. This seemed to be prevalent in a variety of situations. The most common was when a Likert scale was being 
used. 



The specific items that exhibited higher than average problem codes for all three questionnaires 
were provided in the tables and text. To explore the exact nature of these problems and the associated 
methods to eliminate the problems would require going back to the specific question items discussed 
above and getting a more detailed diagnosis of why the problems are occurring. Should these same items 
or instruments be used again, these questions could be further evaluated, either from the recorded 
interviews or in cognitive laboratory investigations, before they are used in future studies. These 
evaluations are needed to better understand the consequences of the behaviors noted in this report. 
FREQUENCY OF RATING FOR 
Table A. Frequency of Rating: SR (N=25) 
Table B. Frequency of Rating: SS&D Parent (N=25) 
Table C. Frequency of Rating: SS&D Youth (N=20) 